• if $q$ is the $i$-th root in $\bar{Q}$ then $h(q)$ is the $i$-th root in $\bar{Q}'$;

• for every $q \in Q$, $\theta(q) \sqsubseteq \theta'(h(q))$;

• for every $q \in Q$ and $f \in \mathrm{Feats}$, if $\delta(q, f)\downarrow$ then $h(\delta(q, f)) = \delta'(h(q), f)$.

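Definition A.31 (Unification of multi-rooted structures) Let $S_1 = \langle \bar{Q}_1, G_1 \rangle$ and $S_2 = \langle \bar{Q}_2, G_2 \rangle$ be MRSs such that $Q_1 \cap Q_2 = \emptyset$ and $|\bar{Q}_1| = |\bar{Q}_2|$. Let $\approx$ be the least equivalence relation on $Q_1 \cup Q_2$ such that

• for every $i$, $1 \le i \le |\bar{Q}_1|$, $\bar{q}_1^i \approx \bar{q}_2^i$, where $\bar{q}_1^i$ and $\bar{q}_2^i$ are the $i$-th roots of $S_1$ and $S_2$;

• $\delta_1(q_1, f) \approx \delta_2(q_2, f)$ if both are defined and $q_1 \approx q_2$.

The unification of $S_1$ and $S_2$ is a new multi-rooted structure $S = \langle \bar{Q}, G \rangle$, where $Q \cap (Q_1 \cup Q_2) = \emptyset$ and $|\bar{Q}| = |\bar{Q}_1|$, defined as follows:

• $Q$, $\delta$ and $\theta$ are defined as in feature structure unification;

• for every $i$, $1 \le i \le |\bar{Q}|$, $\bar{q}^i$ is the equivalence class of $\bar{q}_1^i$ (and of $\bar{q}_2^i$).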

An algorithm for the unification of MRSs can be devised on top of the feature structure
unification algorithm in the natural way: given two MRSs, unify the feature structures that
are defined by the first roots, then by the next roots etc., until all pairs of feature structures
have been unified. It is easy to see that the order in which the pairs are unified is irrelevant.

It is also apparent that the unification of two MRSs is the most general MRS that is 
more specific than them both, just as is the case with feature structures. 

Definition A.32 (Rules) A rule is a MRS with a distinguished last element. If
$\langle X_0, \ldots, X_{n-1}, X_n \rangle$ is a MRS then $\langle X_0, \ldots, X_{n-1} \rangle$ is its body and $X_n$ is its head.
We write such a rule as $\langle X_0, \ldots, X_{n-1} \Rightarrow X_n \rangle$.

Definition A.33 (Grammars) A grammar is a finite set of rules.

B List of Machine Instructions 

The following table lists, for quick reference, the machine instructions and functions, accompanied
by a reference to the page in the text in which they are described.

Query processing                                        Page
    put_node t/n, Xi                                    12
    put_arc Xi, offset, Xj                              12
    advance_q Xi                                        17
    put_disj Xi, n                                      20

Disjunction manipulation                                Page
    loop_start Xi, l, l'                                23
    loop_end l                                          23
    begin_disj Xi, n, l                                 26
    next_disj Xi, l                                     26
    end_disj                                            26

Program processing                                      Page
    get_structure t/n, Xi                               13
    unify_variable Xi                                   13
    unify_value Xi                                      13
    advance_p Xi                                        18

Auxiliary functions                                     Page
    deref(a): address                                   13
    unify(addr1, addr2): boolean                        16
    add_disj_record(l, l', addr, n, i): void            24
    rearrange_disj(): void                              24
    fail()                                              24
Theorem A.27 The result of the unification algorithm (without 'fill') is the most general
feature structure that is subsumed by the unification arguments.

Proof:

Suppose that $fs$ was returned by unify($\bar{q}_1$, $\bar{q}_2$) and that there exists a feature structure $fs'$
such that $fs_1 \sqsubseteq fs'$ and $fs_2 \sqsubseteq fs'$. Then there exists a subsumption morphism $h' : Q_1 \cup Q_2 \to Q'$.
Consider the function $h'' : Q \to Q'$ defined as follows:

$h''(q) = h'(p)$ where $h(p) = q$

By the propositions above it is easy to see that $h''$ is a subsumption morphism, i.e., that
$fs \sqsubseteq fs'$.

A.5 Disjunction

We allow feature structures to be disjunctive: a disjunctive term is a set of terms, where
we usually use ';' to separate elements of the set.

Definition A.28 (Unification of disjunctive terms) Let $\psi_1 = \{\psi_1^1 \mid \cdots \mid \psi_1^n\}$ and $\psi_2 = \{\psi_2^1 \mid \cdots \mid \psi_2^m\}$
be terms containing disjunction. Then $\psi_1 \sqcup \psi_2 = \{\psi_1^i \sqcup \psi_2^j \mid 1 \le i \le n, 1 \le j \le m\} \setminus \{\top\}$.
If the result is a singleton, we write $\{\psi\}$ as $\psi$; if the result is empty, the
unification fails.
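The pairwise construction of Definition A.28 can be illustrated with a minimal Python sketch. It is not the machine's implementation: terms are reduced to atomic strings, and the names `TOP`, `unify_atoms` and `unify_disjunctive` are hypothetical. The base unification is a toy in which `"bot"` is the most general value and distinct atoms clash.

```python
from itertools import product

TOP = "top"  # hypothetical stand-in for the inconsistent type


def unify_atoms(a, b):
    """Toy base unification: 'bot' is the most general value, equal atoms
    unify to themselves, and anything else clashes (yields TOP)."""
    if a == "bot":
        return b
    if b == "bot":
        return a
    return a if a == b else TOP


def unify_disjunctive(psi1, psi2, unify=unify_atoms):
    """Unify every disjunct of psi1 with every disjunct of psi2 and discard
    the inconsistent results, following Definition A.28."""
    results = {unify(x, y) for x, y in product(psi1, psi2)} - {TOP}
    if not results:
        return None                  # empty result: the unification fails
    if len(results) == 1:
        return next(iter(results))   # a singleton {psi} is written as psi
    return results
```

For instance, `unify_disjunctive({"a", "b"}, {"b", "c"})` keeps only the one consistent pair and returns the bare term `"b"`.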

A.6 Multi-rooted Structures

Definition A.29 (Multi-rooted Structures) A multi-rooted structure (MRS) is a
pair $\langle \bar{Q}, G \rangle$ where $G$ is a finite, directed, labeled graph consisting of a set $Q$ of nodes, a
partial function $\delta : Q \times \mathrm{Feats} \to Q$ specifying the arcs and a total function $\theta : Q \to \mathrm{Types}$
labeling the nodes, and where $\bar{Q}$ is an ordered, non-empty list of distinguished nodes in $Q$
called roots. A certain node $q$ can appear more than once in $\bar{Q}$. $G$ is not necessarily
connected, but the union of all the nodes reachable from all the roots in $\bar{Q}$ is required to yield
exactly $Q$. The length of a MRS is the number of its roots, $|\bar{Q}|$.

We use $S$, $R$ (with or without tags, subscripts etc.) to denote MRSs. We use $\delta$, $\theta$, $Q$ and
$\bar{Q}$ (with the same tags or subscripts) to refer to the constituents of MRSs.

If $\langle \bar{Q}, G \rangle$ is a MRS and $\bar{q}_i$ is a root in $\bar{Q}$ then $\bar{q}_i$ defines a feature structure in the
natural way: this feature structure is $(Q_i, \bar{q}_i, \delta_i, \theta_i)$ where $Q_i$ is the set of nodes reachable
from $\bar{q}_i$, $\delta_i$ is the restriction of $\delta$ to $Q_i$ and $\theta_i$ is the restriction of $\theta$ to $Q_i$.

In view of this notion we can refer to a MRS $\langle \bar{Q}, G \rangle$ as an ordered sequence
$\langle fs_1, fs_2, \ldots, fs_n \rangle$ of (not necessarily disjoint) feature structures, where each root in $\bar{Q}$
is the root of the corresponding feature structure and $\langle \bar{Q}, G \rangle$ can be determined by
$\langle fs_1, \ldots, fs_n \rangle$. Note that such an ordered list of feature structures is not a sequence in
the mathematical sense: removing an element from the list affects the other elements (due
to value sharing among elements). Nevertheless, we can think of a MRS as a sequence where
a subsequence is obtained by taking a subsequence of the roots and considering only the
feature structures they induce. We use the two referencing methods interchangeably in the
sequel.
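How a root of a MRS induces a feature structure can be sketched in Python as follows. The representation is an assumption made for illustration only: arcs are a dictionary from (node, feature) pairs to nodes, node types are a dictionary, and the names `reachable` and `induced_fs` are hypothetical.

```python
def reachable(delta, start):
    """Set of nodes reachable from `start`; delta maps (node, feat) -> node."""
    seen, stack = set(), [start]
    while stack:
        q = stack.pop()
        if q in seen:
            continue
        seen.add(q)
        stack.extend(t for (s, _f), t in delta.items() if s == q)
    return seen


def induced_fs(roots, delta, theta, i):
    """Feature structure (Q_i, root, delta_i, theta_i) defined by the i-th
    root: the reachable nodes plus the restrictions of delta and theta."""
    root = roots[i]
    nodes = reachable(delta, root)
    delta_i = {(s, f): t for (s, f), t in delta.items() if s in nodes}
    theta_i = {q: theta[q] for q in nodes}
    return nodes, root, delta_i, theta_i


# A two-rooted structure whose roots share the node "x".
roots = ["r1", "r2"]
delta = {("r1", "F"): "x", ("r2", "G"): "x"}
theta = {"r1": "a", "r2": "b", "x": "bot"}
```

Here both induced feature structures contain the node `"x"`, which is the value sharing that keeps the list of feature structures from being a sequence in the mathematical sense.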

We extend the linear representation of feature structures to MRSs in the natural way, 
where ',' separates two consecutive structures of the MRS. We also extend the notion of 
normal terms to MRSs by requiring that only a first occurrence of some tag within the MRS 
be dependent. 

We extend the notion of subsumption to MRSs in the following way: 

Definition A.30 (Subsumption of multi-rooted structures) A MRS $\langle \bar{Q}, G \rangle$ subsumes
a MRS $\langle \bar{Q}', G' \rangle$ if $|\bar{Q}| = |\bar{Q}'|$ and there exists a total function $h : Q \to Q'$ such
that:
these calls are eliminated, the algorithm returns the unification of its arguments as defined
above.

The following propositions will help in understanding the algorithm. 

Proposition A.20 The function $h$, defined by the algorithm, is total.
Proof:

The unification algorithm starts with $fs_1$ and $fs_2$ and defines $h$ for their roots. Then the
daughters of the roots are scanned, and 'unify' is called recursively with each daughter if $h$
isn't defined for it. Since both $fs_1$ and $fs_2$ are connected, it is guaranteed that $h$ will be
defined over the entire domain when the algorithm ends.

Proposition A.21 If $h(q) = q'$ then $\theta(q) \sqsubseteq \theta(q')$.
Proof:

$h(q)$ is set by 'unify' when $q$ is unified with some other node to produce $q'$. The type
of $q'$ is the least upper bound of the types of $q$ and the other node, hence it is subsumed by
the type of $q$. The same holds when $h$ is redefined, as the new value of $h(q)$ has a type
that is subsumed by the type of the old value. Finally, if $h$ is set by 'copy', then the
type of $h(q)$ equals the type of $q$.

Proposition A.22 If an $f$-labeled arc connects nodes $u$ and $v$ in $Q_1$ or in $Q_2$ then such an
arc connects nodes $h(u)$ and $h(v)$ in $Q$.

Proof:

Immediate from the construction.

Theorem A.23 The result of the unification algorithm is subsumed by both of its arguments.
Proof:

The morphism $h$ defined by the algorithm was proved by the above propositions to cohere
with the subsumption requirements.

Note that $h$ defines an equivalence relation on $Q_1 \cup Q_2$ which holds for a pair $(q_1, q_2)$ iff
$h(q_1) = h(q_2)$. In fact, $h(q)$ is the equivalence class of $q$ with respect to the $\approx$ relation. This
relates our algorithm to Definition A.17.

Let us now assume that the algorithm doesn't issue the call to 'fill'. For this modified 
algorithm, the following propositions hold: 

Proposition A.24 If $q_1, \ldots, q_n \in Q_1 \cup Q_2$ are such that for every $i$, $h(q_i) = q$, then
$\theta(q) = \bigsqcup_i \theta(q_i)$.

Proof:

Just like the proof of Proposition A.21.

Proposition A.25 For every node $q \in Q$ there exists a node $q' \in Q_1 \cup Q_2$ such that
$h(q') = q$.

Proof:

When a node $q$ is introduced by 'copy($q'$)', $h(q')$ is set to $q$. When a new node $q$ is
introduced by 'unify', then if either $q_1$ or $q_2$ are members of $Q_1 \cup Q_2$, $h$ is set for them,
and its value is $q$. Otherwise, $h$ is redefined for some nodes and its new value is the
new node.

Proposition A.26 If $\delta(q, f) = q'$ for $q, q' \in Q$ then there exist nodes $p, p' \in Q_1 \cup Q_2$ such
that $h(p) = q$, $h(p') = q'$ and either $\delta_1(p, f) = p'$ or $\delta_2(p, f) = p'$.

Proof:

It is clear from the construction that arcs are only added to the result on account of
corresponding arcs in one of the unificands.
unify(q1, q2):
    q ← new_node();
    t ← θ(q1) ⊔ θ(q2); if t = ⊤ then return fail; else θ(q) ← t;
    if q1 ∈ (Q1 ∪ Q2) then h(q1) ← q; else substitute(q, q1);
    if q2 ∈ (Q1 ∪ Q2) then h(q2) ← q; else substitute(q, q2);

    for all outgoing edges e1 of q1,
        if e1 is labeled f1 and no f1-labeled edge leaves q2,
        then create an f1-labeled outgoing edge in q and set it to point to copy(δ(q1, f1));
    for all outgoing edges e2 of q2,
        if e2 is labeled f2 and no f2-labeled edge leaves q1,
        then create an f2-labeled outgoing edge in q and set it to point to copy(δ(q2, f2));

    for all features f such that f-labeled arcs are leaving both q1 and q2,
        create an f-labeled edge in q and set it to point to:
        if h(δ(q1, f)) = q1' and h(δ(q2, f)) = q2' then    /* h is defined for both! */
            if q1' = q2' then q1' else unify(q1', q2');
        if h(δ(q1, f)) = q1' and h(δ(q2, f))↑ then
            unify(q1', δ(q2, f));
        if h(δ(q2, f)) = q2' and h(δ(q1, f))↑ then
            unify(q2', δ(q1, f));
        if h(δ(q1, f))↑ and h(δ(q2, f))↑ then
            unify(δ(q1, f), δ(q2, f));
    fill(q);
    return q;

Figure 32: The unification algorithm



substitute(new, old):
    for every q and every f such that δ(q, f) = old, δ(q, f) ← new;
    for every q such that h(q) = old, h(q) ← new;

copy(q):
    if h(q)↓ then return h(q)
    else let q' be a new node with the type θ(q);
        set h(q) = q';
        for every f-labeled arc that leaves q, create an f-labeled arc in q'
        and set its value to copy(δ(q, f));
        return q';

fill(q):
    for all features f that are appropriate for θ(q), if δ(q, f)↑ then
        δ(q, f) ← new_node();
        θ(δ(q, f)) ← Approp(f, θ(q));
        fill(δ(q, f));

Figure 33: The unification algorithm - auxiliary functions



The function h associates the arguments' nodes with nodes in the result. It is
redefined during the algorithm whenever a node q that was already mapped to some image
in the result is unified again. In this case, a new node is created and the image of
q has to be redefined. The morphism h also determines the condition for halting the
recursion: if two nodes are being unified and both their images exist and are equal, there is
no need to continue the recursion.

The calls to 'fill' ensure that if the arguments of the unification are totally well-typed
and the appropriateness specification contains no loops, the result is also totally well-typed. If
A.4 Unification

Let us now abstract away from the identity of specific nodes in feature structures by identifying
alphabetic variants. Unification will be defined for representatives of the equivalence
classes of all feature structures (with respect to alphabetic variance); its result will again be
such an equivalence class representative. We use the term 'unification' to refer to both the
operation and its result.

Definition A.17 (Unification) Let $fs_1 = (Q_1, \bar{q}_1, \delta_1, \theta_1)$ and $fs_2 = (Q_2, \bar{q}_2, \delta_2, \theta_2)$ be
such that $Q_1 \cap Q_2 = \emptyset$ (or use alphabetic variants of them for which this condition holds).
Let $\approx$ be the least equivalence relation on $Q_1 \cup Q_2$ such that

• $\bar{q}_1 \approx \bar{q}_2$

• $\delta_1(q_1, f) \approx \delta_2(q_2, f)$ if both are defined and $q_1 \approx q_2$

Let $[q]_\approx$ be the equivalence class of $q$ with respect to $\approx$. The unification of $fs_1$ and $fs_2$,
$fs_1 \sqcup fs_2$, is a new feature structure $fs = (Q, \bar{q}, \delta, \theta)$, defined as follows:

• $Q$ is the set of equivalence classes of $Q_1 \cup Q_2$ with respect to $\approx$

• $\bar{q} = [\bar{q}_1]_\approx \ (= [\bar{q}_2]_\approx)$

• $\theta(q)$ is the least upper bound of $\{\theta_1(q_1) \mid q_1 \in Q_1 \text{ and } q = [q_1]_\approx\} \cup \{\theta_2(q_2) \mid q_2 \in Q_2 \text{ and } q = [q_2]_\approx\}$

• $\delta(q, f) = q'$ if ($\delta_1(q_1, f) = q_1'$ and $q = [q_1]_\approx$ and $q' = [q_1']_\approx$) or if ($\delta_2(q_2, f) = q_2'$ and
$q = [q_2]_\approx$ and $q' = [q_2']_\approx$)

We say that the unification fails if there exists a node $q \in Q$ for which $\theta(q) = \top$.
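The declarative definition can be approximated by a short union-find sketch in Python. This is an illustration under stated assumptions, not the procedural algorithm given later: feature structures are tuples `(nodes, root, delta, theta)` with `delta` a dictionary from (node, feature) to node; the type join is passed in as a function; and the closure step merges the targets of equally labeled arcs leaving any two equivalent nodes, so that the resulting arc function is well defined. The names `toy_join` and `unify_fs` are hypothetical.

```python
def toy_join(t1, t2):
    """Toy least upper bound: 'bot' is most general, distinct types clash."""
    if t1 == "bot":
        return t2
    if t2 == "bot":
        return t1
    return t1 if t1 == t2 else "top"


def unify_fs(fs1, fs2, join):
    """Return (Q, root, delta, theta) over equivalence classes, or None."""
    (q1, r1, d1, th1), (q2, r2, d2, th2) = fs1, fs2
    delta = {**d1, **d2}          # node sets are assumed disjoint
    theta = {**th1, **th2}
    parent = {q: q for q in q1 | q2}

    def find(q):
        while parent[q] != q:
            parent[q] = parent[parent[q]]   # path halving
            q = parent[q]
        return q

    def arcs_of(rep):
        return {f: t for (s, f), t in delta.items() if find(s) == rep}

    pending = [(r1, r2)]          # the roots are equivalent
    while pending:
        a, b = pending.pop()
        ra, rb = find(a), find(b)
        if ra == rb:
            continue
        arcs_a, arcs_b = arcs_of(ra), arcs_of(rb)
        parent[rb] = ra
        for f in arcs_a.keys() & arcs_b.keys():
            pending.append((arcs_a[f], arcs_b[f]))

    members = {}
    for q in parent:
        members.setdefault(find(q), set()).add(q)
    cls = {rep: frozenset(ms) for rep, ms in members.items()}

    new_theta = {}
    for rep, c in cls.items():
        t = "bot"
        for q in c:
            t = join(t, theta[q])
        if t == "top":
            return None           # some node is typed by the inconsistent type
        new_theta[c] = t
    new_delta = {(cls[find(s)], f): cls[find(t)] for (s, f), t in delta.items()}
    return set(cls.values()), cls[find(r1)], new_delta, new_theta
```

Each node of the result is a frozen set of argument nodes, which mirrors the reading of result nodes as equivalence classes.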

Theorem A.18 $fs_1 \sqcup fs_2$ is the least upper bound of $fs_1$ and $fs_2$ with respect to subsumption,
if an upper bound exists.

Proof:

See [9].

In the following algorithm we assume the existence of an infinite set of nodes from which 
a unique new node can always be drawn. 

Algorithm A.19 (Unification) The unification of $fs_1$ and $fs_2$ is obtained by calling
the function 'unify' (Figures 32 and 33) with $\bar{q}_1$ and $\bar{q}_2$ and considering, as the result, the
graph whose root was returned by the function and whose nodes are all the nodes reachable
from that root.

The algorithm assumes that both the arguments and the result reside in memory, represented
as graphs, so that when given the root, the function can access all other nodes. As
a part of its operation the function defines a morphism $h : (Q_1 \cup Q_2) \to Q$ that associates
each node of the arguments with a node in the result.

Since $Q$, $Q_1$ and $Q_2$ are disjoint we use $\delta$ and $\theta$ without subscripts where the appropriate
function can be determined by the identity of its arguments.

The algorithm starts by unifying the roots of the two structures. Unifying two
nodes is done by creating a new node, with the unification of the arguments' types as its
type, and modifying the function $h$ accordingly (see below). If any of the arguments is not
a member of $Q_1 \cup Q_2$, it is replaced by the new node. The outgoing edges of the arguments
are then taken care of: those whose labels are unique to one of the arguments are
simply copied to the new node, along with their values, using the function 'copy'. For those whose
labels are common to both arguments, the unification is called recursively.
A.3 Correspondence of Feature Structures and Terms

We are using terms to represent feature structures. We define below an algebra over which
terms are to be interpreted. The denotation of a normal term is a totally well-typed feature
structure.

Definition A.16 (Feature structure algebra) A feature structure algebra is a structure
$A = \langle D^A, \{t^A \mid t \in \mathrm{Types}\}, \{f^A \mid f \in \mathrm{Feats}\} \rangle$, such that:

• $D^A$ is a non-empty set, the domain of $A$;

• for each $t \in \mathrm{Types}$, $t^A \subseteq D^A$ and, in particular:

- $\top^A = \emptyset$;

- $\bot^A = D^A$;

- if $t_1 \sqcup t_2 = t$ then $t_1^A \cap t_2^A = t^A$

• for each $f \in \mathrm{Feats}$, $f^A$ is a total function $f^A : D^A \to D^A$

Let $D^G$ be the domain of all typed feature structures over Types and Feats. The
interpretation of $t^G$ over this domain is the set of feature structures whose roots have a type
$t'$ such that $t \sqsubseteq t'$; the interpretation of $f^G : D^G \to D^G$ is the function that, given a feature
structure $fs$, returns $val(fs, f)$.

With each normal term $\psi$ we associate a totally well-typed feature structure $fs$ in the
following way:

• if $\psi = [i]t()$ then $fs = (\{[i]\}, [i], \delta_\emptyset, \theta)$ where $\delta_\emptyset$ is undefined for every input and $\theta([i]) = t$;
• if $\psi = [i]t(\tau_1, \ldots, \tau_n)$ then $fs = (Q, [i], \delta, \theta)$ where $\theta([i]) = t$ and for every $j$, if $f_j$ is
the $j$-th appropriate feature of the type $t$, then $\delta([i], f_j) = q_j$ and $q_j$ is the root of the
feature structure associated with $\tau_j$. $Q$ is $\{[i]\} \cup \bigcup_j Q_j$ where $Q_j$ is the set of nodes in
the feature structure associated with $\tau_j$.

Conversely, with each feature structure $fs = (Q, \bar{q}, \delta, \theta)$ we associate a normal term
$\psi = [i]t(\tau_1, \ldots, \tau_n)$ where:

• $[i] = \bar{q}$;

• $t = \theta(\bar{q})$;

• $n$ is the number of outgoing edges from $\bar{q}$;

• for every $j$, $1 \le j \le n$, $\tau_j$ is the term associated with $\delta(\bar{q}, f_j)$ where $f_j$ is the $j$-th
appropriate feature of $t$;

• if the tag $[i]$ occurs elsewhere in $\tau_1, \ldots, \tau_n$, we replace the term that $[i]$ depends on with
the term $\bot()$, making this occurrence of $[i]$ independent.

If a tag $[j]$ occurs more than once in the term thus constructed, we replace all but its
first occurrence with $\bot()$.

To summarize, there is a one-to-one correspondence between totally well-typed feature
structures and normal terms. In the sequel we use both representations interchangeably.

Note that the tags are only a means of encoding reentrancy in feature structures. Therefore,
when displaying a term in which a tag $[i]$ appears just once, we will sometimes
omit the tag for the sake of compactness. We also sometimes omit the type of independent
tags, which are implicitly typed by $\bot$, and display them as tags only.
• for every $q \in Q_1$, $\theta_1(q) \sqsubseteq \theta_2(h(q))$

• for every $q \in Q_1$ and for every $f$ such that $\delta_1(q, f)\downarrow$, $h(\delta_1(q, f)) = \delta_2(h(q), f)$

i.e., $h$ maps every node in $Q_1$ to a node in $Q_2$ such that the type of the first node subsumes
the type of the second, and if an arc labeled $f$ connects $q$ and $q'$ in $Q_1$, then such an arc
connects $h(q)$ and $h(q')$ in $Q_2$.
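Because a feature structure is connected, the root condition and the arc condition determine at most one candidate morphism, so subsumption can be checked by a single traversal. The following Python sketch assumes feature structures given as `(root, delta, theta)` with `delta` a dictionary from (node, feature) to node, and a type order supplied as a predicate; the names `toy_leq` and `subsumes` are hypothetical.

```python
def toy_leq(t1, t2):
    """Toy type order: 'bot' subsumes everything, a type subsumes itself."""
    return t1 == "bot" or t1 == t2


def subsumes(fs1, fs2, leq=toy_leq):
    """True iff a subsumption morphism h from fs1's nodes to fs2's exists."""
    r1, d1, th1 = fs1
    r2, d2, th2 = fs2
    h = {r1: r2}                  # the root must map to the root
    stack = [r1]
    while stack:
        q = stack.pop()
        if not leq(th1[q], th2[h[q]]):
            return False          # type condition violated
        for (s, f), t in d1.items():
            if s != q:
                continue
            image = d2.get((h[q], f))
            if image is None:
                return False      # fs2 lacks the corresponding arc
            if t in h:
                if h[t] != image:
                    return False  # sharing in fs1 is not mirrored in fs2
            else:
                h[t] = image
                stack.append(t)
    return True
```

A structure in which two features share a value is more specific than the same structure without sharing, so `subsumes` holds in one direction only for such a pair.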

Definition A.11 (Alphabetic Variants) Two feature structures $fs_1$ and $fs_2$ are alphabetic
variants ($fs_1 \sim fs_2$) iff $fs_1 \sqsubseteq fs_2$ and $fs_2 \sqsubseteq fs_1$.

Alphabetic variants have exactly the same structure, and corresponding nodes have the same 
types. The identities of the nodes are what tell them apart. 

A.2 A Linear Representation of Feature Structures

Representing feature structures as either graphs or attribute-value matrices is cumbersome;
we now define a linear representation for feature structures, based upon Aït-Kaci's $\psi$-terms.

Definition A.12 (Arity) The arity of a type $t$ is the number of features appropriate for
it, i.e., $|\{f \mid \mathit{Approp}(f, t)\downarrow\}|$.

Note that in every totally well-typed feature structure of type $t$ the number of edges leaving
the root is exactly the arity of $t$. Consequently, we use the term 'arity' for (totally well-typed)
feature structures: the arity of a feature structure of type $t$ is defined to be the arity
of $t$.

Let $\{[i] \mid i \text{ is a natural number}\}$ be the set of tags.

Definition A.13 (Terms) A term $\tau$ of type $t$ is an expression of the form $[i]t(\tau_1, \ldots, \tau_n)$
where $[i]$ is a tag, $n \ge 0$ and every $\tau_i$ is a term of some type. If $n = 0$ we sometimes omit
the '()'.

Definition A.14 (Totally well-typed terms) A term $\tau = [i]t(\tau_1, \ldots, \tau_n)$ of type $t$ is
totally well-typed iff:

• $t$ is a type of arity $n$;

• the appropriate features for the type $t$ are $f_1, \ldots, f_n$, in this order;

• for every $i$, $1 \le i \le n$, $\mathit{Approp}(f_i, t)\downarrow$;

• for every $i$, $1 \le i \le n$, if $\tau_i$ is a term of type $t_i'$ and $\mathit{Approp}(f_i, t) = t_i$ then either
$t_i \sqsubseteq t_i'$ or $t_i' = \bot$

We distinguish tags that appear in terms according to the type they are attached to: if a
sub-term consists of a tag and the type $\bot$, we say that the tag is independent. Otherwise,
the tag is dependent. We will henceforth consider only terms that are normal:

Definition A.15 (Normal terms) A totally well-typed term $\psi = [i]t(\tau_1, \ldots, \tau_n)$ is normal
iff:

• $t \ne \top$;

• if a tag $[j]$ appears in $\psi$ then its first (leftmost) occurrence might be dependent. If it
appears more than once, its other occurrences are independent.

• $\tau_1, \ldots, \tau_n$ are normal terms.
i.e., every feature is introduced by some most general type, and is appropriate for all its
subtypes; and if the appropriate type for a feature in $t_1$ is some type $t$, then the appropriate
type of the same feature in $t_2$, which is a subtype of $t_1$, must be at least as specific as $t$.

If $\mathit{Approp}(f, t)\downarrow$ we say that $f$ is appropriate for $t$ and that $\mathit{Approp}(f, t)$ is the appropriate
type for the feature $f$ in the type $t$. We assume that the set of features appropriate for some
type is ordered (recall that Feats is ordered).

Definition A.4 (Well-typed feature structures) A feature structure $(Q, \bar{q}, \delta, \theta)$ is well
typed iff for all $f \in \mathrm{Feats}$ and $q \in Q$, if $\delta(q, f)\downarrow$ then $\mathit{Approp}(f, \theta(q))\downarrow$ and
$\mathit{Approp}(f, \theta(q)) \sqsubseteq \theta(\delta(q, f))$,

i.e., if an arc labeled $f$ connects two nodes, then $f$ is appropriate for the type of the
source node; and the appropriate type for $f$ in the type of the source node subsumes the
type of the target node.

Definition A.5 (Total well-typedness) A feature structure is totally well-typed iff it
is well typed and for all $f \in \mathrm{Feats}$ and $q \in Q$, if $\mathit{Approp}(f, \theta(q))\downarrow$ then $\delta(q, f)\downarrow$,

i.e., every feature which is appropriate for the type labeling some node must imply the
existence of an outgoing arc labeled by this feature.

Definition A.6 (Appropriateness Loops) The appropriateness specification contains a
loop if there exist $t_1, t_2, \ldots, t_n \in \mathrm{Types}$ such that for every $i$, $1 \le i \le n$, there is a feature
$f_i \in \mathrm{Feats}$ such that $\mathit{Approp}(f_i, t_i) = t_{i+1}$, where $t_{n+1} = t_1$.
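An appropriateness loop is a directed cycle in the graph whose edges lead from a type to the appropriate types of its features, so it can be detected by a depth-first search. The sketch below assumes the specification is given as a Python dictionary listing every (feature, type) pair for which Approp is defined; the function name and the example type names are hypothetical.

```python
def has_appropriateness_loop(approp):
    """approp maps (feature, type) -> appropriate type. Returns True iff
    following Approp values from some type leads back to that type."""
    edges = {}
    for (_feature, t), value in approp.items():
        edges.setdefault(t, set()).add(value)

    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    color = {}

    def visit(t):
        color[t] = GREY
        for nxt in edges.get(t, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:
                return True        # reached a type on the current path
            if state == WHITE and visit(nxt):
                return True
        color[t] = BLACK
        return False

    return any(color.get(t, WHITE) == WHITE and visit(t) for t in list(edges))
```

A list type whose TAIL feature is appropriate for the same list type, for example, yields a loop of length one.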

Definition A.7 (Paths) A path is a sequence of feature names, and the set $\mathrm{Paths} = \mathrm{Feats}^*$
denotes the collection of paths. The definition of $\delta$ is extended to paths in the natural
way:

$\delta(q, \epsilon) = q$ (where $\epsilon$ is the empty path)

$\delta(q, f\pi) = \delta(\delta(q, f), \pi)$
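The two equations translate directly into a loop over the features of the path. In this minimal Python sketch the arc function is assumed to be a dictionary from (node, feature) to node, an undefined value is represented by `None`, and the name `delta_path` is hypothetical.

```python
def delta_path(delta, q, path):
    """Extend delta to paths: follow the features one at a time.
    Returns None as soon as some step is undefined."""
    for f in path:
        q = delta.get((q, f))
        if q is None:
            return None
    return q      # for the empty path this is the start node itself
```

With arcs `{("q0", "SUBJ"): "q1", ("q1", "AGR"): "q2"}`, the path `["SUBJ", "AGR"]` leads from `"q0"` to `"q2"`, while `["AGR"]` is undefined at `"q0"`.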

Definition A.8 (Path Values) The value of a path $\pi$ in a feature structure $fs = (Q, \bar{q}, \delta, \theta)$,
denoted by $val(fs, \pi)$, is non-trivial if and only if $\delta(\bar{q}, \pi)\downarrow$, in which case
it is a feature structure $fs' = (Q', \bar{q}', \delta', \theta')$, where:

• $\bar{q}' = \delta(\bar{q}, \pi)$

• $Q' = \{q' \mid \text{there exists a path } \pi' \text{ such that } \delta(\bar{q}', \pi') = q'\}$ ($Q'$ is the set of nodes reachable
from $\bar{q}'$)

• for every feature $f$ and for every $q' \in Q'$, $\delta'(q', f) = \delta(q', f)$ ($\delta'$ is the restriction of $\delta$
to $Q'$)

• for every $q' \in Q'$, $\theta'(q') = \theta(q')$ ($\theta'$ is the restriction of $\theta$ to $Q'$)

If $\delta(\bar{q}, \pi)\uparrow$, $val(fs, \pi)$ is defined to be a single node whose type is $\top$.

Definition A.9 (Reentrancy) A feature structure $fs$ is reentrant if there exist two non-empty
paths $\pi_1, \pi_2$ such that $\delta(\bar{q}, \pi_1) = \delta(\bar{q}, \pi_2)$. In this case the two paths are said to share
the same value.

Definition A.10 (Subsumption) $fs_1 = (Q_1, \bar{q}_1, \delta_1, \theta_1)$ subsumes $fs_2 = (Q_2, \bar{q}_2, \delta_2, \theta_2)$
($fs_1 \sqsubseteq fs_2$) iff there exists a total function $h : Q_1 \to Q_2$, called a subsumption morphism,
such that

• $h(\bar{q}_1) = \bar{q}_2$
A Theory of Feature Structures

This section gives a brief survey of the theory underlying our design. We follow Carpenter
([7, 9]) in the presentation of the basic building blocks of the TFS theory. The linear
representation of terms is based upon Aït-Kaci ([5]). We then give a procedural definition
for the unification operation which is parallel to Carpenter's definition. We extend the
notion of a feature structure to sequences of feature structures; these sequences will be used
for representing phrasal signs and rules.

A.1 Types and Feature Structures

For the following discussion we fix non-empty, finite, disjoint sets Types and Feats of types
and feature names, respectively. We assume that the set Feats is totally ordered.

A word concerning partial functions is in order here: we use the symbol '$\downarrow$' (read: 'is
defined') to denote that a partial function is defined for some value and the symbol '$\uparrow$'
(read: 'is not defined') to denote the negation of '$\downarrow$'. Whenever the comparison operator '='
is applied to the result of an application of a partial function, it is meant that the equation
holds iff both sides are defined and equal.

Definition A.1 (Type Hierarchy) A partial order relation $\sqsubseteq$ over $\mathrm{Types} \times \mathrm{Types}$ is
an inheritance hierarchy if it is bounded complete, i.e., if every up-bounded subset $T$ of
Types has a (unique) least upper bound, $\sqcup T$, referred to as the unification of the types in
$T$.

If $t_1 \sqsubseteq t_2$ we say that $t_1$ subsumes, or is more general than, $t_2$; $t_2$ is a subtype of
$t_1$.

Let $\bot$ be the most general type, i.e., $\bot$ is the least upper bound of the empty set of types.
Let $\top$ be the most specific type, i.e., $\top = \sqcup \mathrm{Types}$. If $\sqcup T = \top$ we say that $T$ is inconsistent.
Let $\sqcap T$ be the greatest lower bound of the set $T$.
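To make the type order concrete, the following Python sketch computes least upper bounds in a small hand-written hierarchy. The hierarchy itself, its type names, and the functions `make_leq` and `lub` are assumptions chosen for illustration; the hierarchy is given by listing the immediate subtypes of every type, with `"bot"` most general and `"top"` most specific.

```python
def make_leq(immediate):
    """Build the order t1 <= t2 ('t1 subsumes t2') from a map giving the
    immediate subtypes of every type (leaves map to an empty set)."""
    def descendants(t):
        seen, stack = {t}, [t]
        while stack:
            for s in immediate.get(stack.pop(), ()):
                if s not in seen:
                    seen.add(s)
                    stack.append(s)
        return seen

    below = {t: descendants(t) for t in immediate}
    return lambda t1, t2: t2 in below[t1]


def lub(types, t1, t2, leq):
    """Least upper bound of t1 and t2: the upper bound subsuming all others."""
    uppers = [u for u in types if leq(t1, u) and leq(t2, u)]
    least = [u for u in uppers if all(leq(u, v) for v in uppers)]
    return least[0] if least else None


# A toy hierarchy: bot above everything, top below everything.
HIERARCHY = {
    "bot": {"agr", "cat"},
    "agr": {"sg", "pl"},
    "cat": {"top"},
    "sg": {"top"},
    "pl": {"top"},
    "top": set(),
}
LEQ = make_leq(HIERARCHY)
```

In this hierarchy the unification of `"agr"` and `"sg"` is `"sg"`, while `"sg"` and `"pl"` are inconsistent because their only upper bound is `"top"`.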

Definition A.2 (Feature Structures) A feature structure $fs$ is a directed, connected,
labeled graph consisting of a finite set of nodes $Q$, a root $\bar{q} \in Q$, a partial function
$\delta : Q \times \mathrm{Feats} \to Q$ specifying the arcs and a total node-typing function $\theta : Q \to \mathrm{Types}$.

The nodes of a feature structure are thus labeled by types while the arcs are labeled by
feature names. The root $\bar{q}$ is a distinguished node from which all other nodes are reachable.
We say that a feature structure is of type $t$ when $\theta(\bar{q}) = t$.

Let FS be the collection of all feature structures over the given Feats and Types.

We use $fs$ (with or without tags, subscripts etc.) to refer to feature structures. We use
$Q$, $\bar{q}$, $\delta$ and $\theta$ (with the same tags or subscripts) to refer to constituents of feature structures.

Note that all feature structures are, by definition, graphs. Some grammatical formalisms 
used to have a special kind of feature structures, namely atoms; atoms are represented in 
our framework as nodes with no outgoing edges. For a discussion regarding the implications 
of such an approach, refer to [9, Chapter 8]. 

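Definition A.3 (Appropriateness) An appropriateness specification over the type
inheritance hierarchy and the set Feats is a partial function $\mathit{Approp} : \mathrm{Feats} \times \mathrm{Types} \to \mathrm{Types}$,
such that:

• let $T_f = \{t \in \mathrm{Types} \mid \mathit{Approp}(f, t)\downarrow\}$; then for every $f \in \mathrm{Feats}$, $T_f \ne \emptyset$ and
$\sqcap T_f \in T_f$.

• if $\mathit{Approp}(f, t_1)\downarrow$ and $t_1 \sqsubseteq t_2$ then $\mathit{Approp}(f, t_2)\downarrow$ and $\mathit{Approp}(f, t_1) \sqsubseteq \mathit{Approp}(f, t_2)$.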
begin_disj Xi,n,l =
    add_disj_record(l, l, H, n, i);
    HEAP[H] ← <OR,n>;
    H ← H + n + 1;
    DS[D].orig_str ← deref(Xi);

next_disj Xi,l =
    DS[D].curr_disj++;
    addr ← DS[D].or_addr + DS[D].curr_disj;
    Xi ← addr;
    DS[D].end_label ← l;
    copy(DS[D].orig_str, addr);

end_disj =
    bind(DS[D].orig_str, DS[D].or_addr);
    rearrange_disj();

Figure 31: Implementation of the disjunction instructions

for such formalisms that are based on typed feature structures. The presentation made use
of an abstract machine specifically tailored for this kind of application. In addition, we
described a compiler for a general TFS-based language. The compiled code, in terms of
abstract machine instructions, can be interpreted and executed on ordinary hardware. The
use of abstract machine techniques is expected to result in highly efficient processing.

This project is still under development. We described here a very simple machine, capable 
of unifying two feature structures. We then extended the coverage of the machine - and 
the compiler - by allowing disjunction within feature structures and enabling unification of 
sequences of feature structures. In other words, our machine is capable of applying a single 
phrase structure rule. The next step will be the addition of control structures that will enable 
implementation of a parsing algorithm inherent to the machine. Special constructs will be 
added to select an appropriate rule out of several possible ones and to maintain temporary 
results. Future extensions might include negation, list- and set- values and special constructs 
for generation. 
Figure 30: Code generated for the program a({b(bot,d) | a(bot,d1)},d1)

represented using an extra cell that can store the address of its copy; the second solution
implies the incorporation of a hash table for temporarily storing nodes of the feature structure
that is currently being copied. Since the copy operation is expected to be performed in
other situations (e.g., when manipulating a parser chart), and many graphs are in general
expected to be copied, it seems that the better solution is to extend the record representing
each node so that a field for an address of its copy is added. The implementation of copy
is, thus, straightforward.
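The chosen solution, an extra field per node holding the address of its copy, can be sketched in Python as follows. This models nodes as objects rather than heap cells, so it is an illustration of the idea and not the machine's copy function; the class `Node` and the function `copy_graph` are hypothetical names.

```python
class Node:
    def __init__(self, type_):
        self.type = type_
        self.arcs = {}      # feature name -> Node
        self.copy = None    # extra field: this node's copy, once created


def copy_graph(node):
    """Copy the graph rooted at `node`, preserving reentrancy and cycles."""
    if node.copy is not None:
        return node.copy    # already copied, or currently being copied
    clone = Node(node.type)
    node.copy = clone       # label the node before following its arcs
    for feat, target in node.arcs.items():
        clone.arcs[feat] = copy_graph(target)
    return clone
```

Setting the field before the arcs are followed is what lets a cycle terminate: when the traversal returns to a node, it finds the copy already recorded. In this sketch the `copy` fields would have to be reset before the same graph is copied a second time.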

5 Conclusion 

As linguistic formalisms become more rigorous, the necessity of well-defined semantics for
grammar specifications increases. We presented a first step towards an operational semantics
function add_disj_record(l, l': label, addr: address, n, i: integer): void
begin
    D ← D + 1;
    DS[D].start_label ← l;
    DS[D].end_label ← l';
    DS[D].or_addr ← addr;
    DS[D].curr_disj ← 0;
    DS[D].non_fail ← n;
    DS[D].register ← i;
end;

function rearrange_disj(): void
begin
    addr ← DS[D].or_addr;
    if (DS[D].non_fail = 0) then
        D ← D - 1;
        fail();
    if (DS[D].non_fail = 1) then                  % eliminate OR-cell
        bind(addr, addr+1);                       % HEAP[addr] ← <REF,addr+1>
    if (DS[D].non_fail ≠ *(addr)) then            % some disjuncts failed
        i ← 1;
        while (i < *(addr)) do                    % for every original disjunct
            t ← addr+i; j ← 1;                    % remove self-ref
            while ((HEAP[t] = <REF,t>) and (i < *(addr))) do
                HEAP[t] ← HEAP[t+j];
                j ← j + 1; i ← i+1;
            i ← i + 1;
    if (DS[D].non_fail > 1) then
        HEAP[addr] ← <OR,DS[D].non_fail>;         % update OR-cell
    D ← D - 1;
end;

Figure 28: Implementation of the disjunction auxiliary functions



function fail(): void
begin
    if (D > 0) then
        addr ← DS[D].or_addr + DS[D].curr_disj;
        DS[D].non_fail--;
        HEAP[addr] ← <REF,addr>;
        jump DS[D].end_label;
    else
        abort;
end;

Figure 29: Implementation of the fail function
loop_start Xi,l,l' =
    addr ← deref(Xi); Xi ← addr;
    if (HEAP[addr] = <OR,k>) then
        add_disj_record(l, l', addr, k, i);
        Xi ← addr + 1;                            % first disjunct

loop_end l =
    if (DS[D].start_label = l) then
        DS[D].curr_disj++;
        if (DS[D].curr_disj < *(DS[D].or_addr)) then
            i ← DS[D].register;                   % more disjuncts left
            Xi ← *(DS[D].or_addr + DS[D].curr_disj);
            jump l;
        else
            rearrange_disj();

Figure 27: Implementation of the loop instructions



4.2.4 Disjunctive Programs 

When the program itself is disjunctive one cannot avoid copying the query with which it 
has to be unified. To understand this recall that, unlike Prolog, our system might return 
a disjunctive value in cases where one of the unification arguments is disjunctive. Hence, 
unification has to collect possible results, rather than pick a possible value and stick to it until 
it either fails, in which case another value is chosen, or successfully undergoes unification. 

To accommodate disjunctive programs we introduce three new instructions, namely
begin_disj, next_disj and end_disj. These instructions are generated such that the
code for a disjunctive program term starts with begin_disj; prior to each disjunct, including
the first, a next_disj instruction is generated; and to conclude a disjunctive term
we generate end_disj. For example, figure 30 shows the code that is generated for the term
a({b(bot,d) | a(bot,d1)},d1), taken as a program.

To enable copying of the query, we add a field to each disjunction record: orig_str stores
the address of the original structure that we copy. The implementation of begin_disj, given
in figure 31, is straightforward: a new disjunction record is added to DS, an OR-cell is built
on the heap and the address of the original structure, taken from Xi, is recorded. next_disj
copies the original structure each time it is executed. It also modifies the end_label field
of the current disjunction record: this field stores the address of the instruction to jump
to upon failure. Finally, end_disj replaces the original structure with a pointer to the
newly-built OR structure and rearranges this OR structure much as loop_end does.

When rearranging the OR structure we can check to see if there are nested disjunctions
(i.e., if some arc leaving the OR node points to another OR node). We define
$\{\{a_1^1 \mid \cdots \mid a_1^{n_1}\} \mid \cdots \mid \{a_k^1 \mid \cdots \mid a_k^{n_k}\}\}$ to be equivalent to $\{a_1^1 \mid \cdots \mid a_1^{n_1} \mid \cdots \mid a_k^1 \mid \cdots \mid a_k^{n_k}\}$. Therefore,
in such cases the inner disjuncts can be lifted so that all of them reach the same level.
This modification is left for an optimization process, not designed yet.
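The lifting of inner disjuncts amounts to flattening a nested collection. A minimal Python sketch follows, assuming that a disjunction is modeled as a list and a plain disjunct as any non-list value; the name `lift` is hypothetical, and the sketch says nothing about how the heap cells themselves would be rearranged.

```python
def lift(disjuncts):
    """Flatten nested disjunctions so that all disjuncts reach one level."""
    flat = []
    for d in disjuncts:
        if isinstance(d, (list, tuple)):
            flat.extend(lift(d))   # an inner disjunction: lift its members
        else:
            flat.append(d)
    return flat
```

For example, `lift([["a", "b"], ["c"], "d"])` gives `["a", "b", "c", "d"]`, keeping the original left-to-right order of the disjuncts.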

What is left to show is the implementation of the function copy. The problem in implementing
this function stems from the possibility of cycles in the feature structure to be
copied. When scanning it, each node must be labeled, before it is copied, with the
address of its copy. Thus, reentrancy (and directed cycles) can be preserved. There are
two alternatives for storing this information: it can either be kept attached to each node,
or stored in a temporary table. The first solution implies that each node must now be
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Figure 26: New code for the program a(e([l]d2,[l]),dl). 

non_fail the number of disjuncts that successfully passed unification.

A special purpose register D is used to point to the current record of DS. 

loop_start (figure 27) checks whether the node with which the program is unified is an
OR-cell. If it is not, loop_start does nothing, so the overhead is minimal. If it is an OR-cell,
a new record is added to the disjunction stack. Among the values stored in a disjunction
record, the start_label, l, should be noted: it uniquely identifies the get_structure
instruction immediately following the loop_start that adds the record. It serves as a way to
determine whether or not this get_structure instruction is matched against a disjunctive
value: if it is, the current disjunction record has l as the value of start_label.

loop_end receives as a parameter the label l of the corresponding get_structure
instruction. Therefore, all that loop_end has to do in order to know whether or not an OR-cell
actually existed is to compare l with the start_label of the current disjunction record.

loop_end then checks if more disjuncts exist. If so, the register Xi, where i is the value of
the field register of the current record, is set to point to the current disjunct and execution
returns to the beginning of the loop. Otherwise, rearrange_disj is called: the results of
the disjunctive unification are inspected by considering the value of the non_fail field in
the current disjunction record. If it is zero, the unification fails. If it is one, the OR-cell is
eliminated and is replaced by a REF-cell, as the result is non-disjunctive. The REF cells that
pointed to the disjuncts are then scanned; those that are self-referential are eliminated. The
implementation of the disjunction maintenance auxiliary functions is depicted in figure 28.

How can such a REF cell become self-referential? To understand that, note that the
notion of failure must be changed. Before disjunction was introduced, fail implied the
immediate termination of processing. Now, however, failure of one disjunct does not rule out
the possibility of successful unification of another. If failure occurs within a disjunction, the
next disjunct must be tried. All that has to be done is to mark the current disjunct as invalid,
by transforming the pointer to it into a self-referential cell. When loop_end rearranges
OR-cells, it eliminates self-referential cells. The implementation of fail is depicted in figure 29.
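
A minimal sketch of this marking scheme follows. It is an illustration under stated assumptions, not the code of figures 28 and 29: the heap is modelled as a Python list of (tag, value) pairs, an OR-cell stores the number of disjuncts and is followed directly by one REF cell per disjunct, and the survivors are counted by scanning rather than by consulting the non_fail field. The function names fail_disjunct and rearrange_disj are used loosely and their signatures are hypothetical.

```python
REF, OR, STR = "REF", "OR", "STR"


def fail_disjunct(heap, or_addr, k):
    """Invalidate the k-th disjunct (1-based) of the OR-cell at or_addr
    by turning the REF cell that points to it into a self-referential cell."""
    cell = or_addr + k
    heap[cell] = (REF, cell)


def rearrange_disj(heap, or_addr):
    """Eliminate self-referential REF cells below an OR-cell.

    Returns False when no disjunct survived (the unification fails).
    With exactly one survivor the OR-cell is replaced by a REF cell.
    """
    _, n = heap[or_addr]
    survivors = [heap[or_addr + k][1]
                 for k in range(1, n + 1)
                 if heap[or_addr + k][1] != or_addr + k]
    if not survivors:
        return False
    if len(survivors) == 1:
        heap[or_addr] = (REF, survivors[0])      # result is non-disjunctive
    else:
        heap[or_addr] = (OR, len(survivors))     # compact the remaining arcs
        for k, target in enumerate(survivors, start=1):
            heap[or_addr + k] = (REF, target)
    return True


heap = [(OR, 3), (REF, 4), (REF, 5), (REF, 6),
        (STR, "b"), (STR, "a"), (STR, "d")]
fail_disjunct(heap, 0, 2)          # the second disjunct failed
print(rearrange_disj(heap, 0), heap[:3])
```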
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put_node dl/0,X9 
Figure 25: Compiled code and heap representation for a({b(bot,d) | a(bot,dl)},dl)



4.2.2 Unifying Disjunctive Feature Structures 

Informally, unifying two disjunctive values results in a disjunctive value, in which each
disjunct is the result of unifying some disjunct of one unificand with some disjunct of the other.
If the number of non-failure results is exactly one, the result is a simple, non-disjunctive
feature structure; if it is zero, the entire unification fails. We can always substitute a
non-disjunctive value for a disjunction of arity one. The order of the disjuncts is irrelevant. See
Appendix A.5 for the formal details.
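
The informal definition above can be sketched as a pairwise combination of disjuncts. The sketch assumes a toy stand-in for feature structure unification (two atoms unify only when they are equal); unify_atoms and unify_disjunctive are hypothetical names, and the real operation works on heap-resident graphs rather than on Python values.

```python
from itertools import product


def unify_atoms(x, y):
    """Toy stand-in for unification: atoms unify only if equal.
    Returns None to signal failure."""
    return x if x == y else None


def unify_disjunctive(ds1, ds2, unify=unify_atoms):
    """Unify two disjunctive values given as lists of disjuncts."""
    results = []
    for x, y in product(ds1, ds2):
        r = unify(x, y)
        if r is not None:            # keep only the non-failure results
            results.append(r)
    if not results:
        return None                  # zero results: the unification fails
    if len(results) == 1:
        return results[0]            # arity one: a non-disjunctive value
    return results


print(unify_disjunctive(["a", "b"], ["b", "c"]))
```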

4.2.3 Disjunctive Queries 

Consider first the case of disjunctive queries, where the program does not contain
disjunction. Each of the disjuncts has to be unified, in turn, with the program term. This calls for a
major modification in the compiled code of programs: this code is now potentially iterative.
Therefore, we encapsulate the code that is generated for each subterm with loop_start and
loop_end instructions. We add a label to every get_structure instruction and a corresponding
label to every loop_end. Using these labels, control can pass from the end of the loop
to its beginning and failure recovery can take place, as will be explained below. Figure 26
shows the code that is generated for the (non-disjunctive) program term a(e([1]d2,[1]),dl).

To handle the iteration process we introduce an additional data structure: the disjunction 
stack, DS, stores a disjunction record for every disjunction encountered during the execution. 
Each disjunction record has the following fields: 

start_label the label corresponding to the get_structure instruction against which the
disjunction is matched;

end_label the label corresponding to the loop_end instruction (the instruction to jump to
upon failure);

register the index of the register that is allocated for the get_structure instruction; 

or_addr the heap address of the OR cell; 

curr_disj the serial number of the current disjunct; 






we enclose each disjunctive value within curly brackets. We eliminate structure sharing in 
disjuncts by disallowing tags in the terms that represent them. 

Now that disjunction is introduced, the notion of failure must be changed: failure no
longer entails aborting the computation, as failure of one disjunct does not rule out
the possibility that another disjunct succeeds. A method of retrying other possibilities must
be employed; it is described below.

4.2.1 Building Disjunctive Feature Structures 

While the syntax of our input language was only slightly modified, a larger number of
changes must be made in the design to accommodate disjunction. First, we must show
how disjunctive feature structures are built on the heap. When flattening a linear term we
treat the '|' operator somewhat like a special type, 'OR', the arity of which is the number
of disjuncts. Thus, the term a({b(bot,d) | a(bot,dl)},dl) is transformed to the sequence of
equations presented in figure 23.
Figure 23: The disjunctive term a({b(bot,d) | a(bot,dl)},dl) as a set of equations



A new instruction, put_disj, is introduced; its code is given in figure 24. It creates
a special type of cell on the heap: an OR-cell, containing the number of disjuncts. For each
put_disj instruction additional put_arc instructions are generated that create a REF cell
for each disjunct. The put_disj instructions are accumulated in the put_node instruction
stream. The code that is generated for the above term, taken as a query, is depicted in
figure 25, along with the heap after execution of this code.
Figure 24: The implementation of the put_disj instruction
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the manipulation of a (program) rule whose body equals in length to the query. 

When compiling a rule, its body generates the same code as that of a program MRS, as 
explained in the previous section. The head of the rule is treated as a query, and hence the 
code that is produced for the head is just the code that would have been produced for a 
single feature structure query of the same form. 

For example, consider the following rule: b(b([2]d,[2]),[4]dl), a([4],[4]) => b(b([2],[4]),d2). 
The generated code for this rule is listed in figure 22. Note that the registers we use for the 
head are the same set of registers that were used for the body, to accommodate reentrancy. 



CURR_ROOT <- 0               % initialization
advance_p X1                 % body, first feature structure
get_structure b/2, X1        % X1 = b(
unify_var X2                 %        X2,
unify_var X3                 %        X3)
get_structure b/2, X2        % X2 = b(
unify_var X4                 %        X4,
unify_value X4               %        X4)
get_structure d/0, X4        % X4 = d
get_structure dl/0, X3       % X3 = dl
advance_p X5                 % body, second feature structure
get_structure a/2, X5        % X5 = a(
unify_value X3               %        X3,
unify_value X3               %        X3)
CURR_ROOT <- 0               % head
put_node b/2, X6             % X6 = b(
put_arc X6,1,X7              %        X7,
put_arc X6,2,X8              %        X8)
put_node b/2, X7             % X7 = b(
put_arc X7,1,X4              %        X4,
put_arc X7,2,X3              %        X3)
put_node d2/0, X8            % X8 = d2
advance_q X6



Figure 22: Compiled code for the program 



4.2 Disjunction 

Disjunctive values denote indeterminateness: they represent the proposition that more than 
one value is suitable for some feature. Disjunction within feature structures was discussed 
in [17, 12, 13, 14] and various kinds of disjunctive values were implemented in different 
systems. Some complexities arise when dependent disjunction is employed, i.e., when the 
disjunct chosen in one choice point has to correspond with the disjunct chosen in another; 
the interaction of disjunction with reentrancy is also problematic. For more details, see [14]. 
As our system is motivated by linguistic considerations, we choose not to maintain the 
most general notion of disjunctive values. We limit our input language so that only
independent disjunctions are allowed, and no disjunct can be reentrant. We augment the syntax
of linear terms by adding the '|' operator as a separator between disjuncts; for readability. 
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it. Therefore, no code is generated for the tag, and two consecutive advance_q instructions 
are created. 

Consider, for example, the following query: a([3]dl,[3]), b(b([1]d,[1]),[3]). The code that
is generated for this query is listed in figure 20.

CURR_ROOT <- 0
put_node a/2,X1        % X1 = a(
put_arc X1,1,X2        %        X2,
put_arc X1,2,X2        %        X2)
put_node dl/0,X2       % X2 = dl
advance_q X1
put_node b/2,X3        % X3 = b(
put_arc X3,1,X4        %        X4,
put_arc X3,2,X2        %        X2)
put_node b/2,X4        % X4 = b(
put_arc X4,1,X5        %        X5,
put_arc X4,2,X5        %        X5)
put_node d/0,X5        % X5 = d
advance_q X3

Figure 20: Compiled code for the query a([3]dl,[3]), b(b([1]d,[1]),[3])



4.1.3 Processing of a Program 

When processing a program, similar modifications must be made; each feature structure in 
the MRS is converted to a set of equations in turn. Prior to every code section (corresponding 
to a single feature structure for which Xi is allocated), an additional new instruction is 
inserted: advance_p Xi, whose code is given in figure 21. 



advance_p Xi =
    Xi <- ROOTS[CURR_ROOT];
    CURR_ROOT <- CURR_ROOT + 1;



Figure 21: The advance_p instruction 

Assuming that CURR_ROOT is set to 0 by the initialization of the compilation, this
modification guarantees that when get_structure uses Xi to match the current program
feature structure against a feature structure resident in memory, the address of the resident
feature structure is taken from ROOTS[CURR_ROOT], i.e., the k-th program feature structure
is guaranteed to be matched against the k-th query feature structure. Here again, if a
program contains a feature structure that is reduced to a tag only, then just the new statements
that separate consecutive feature structures will be generated, but no special code will be
generated for the tag itself.
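
The interplay between the ROOTS array and the two advance instructions can be sketched as follows. This is only an illustration: it assumes that CURR_ROOT is reset to 0 before the query and again before the program, and it models ROOTS and the X registers as Python dictionaries. The class name RootsMachine is hypothetical.

```python
class RootsMachine:
    """Toy model of ROOTS, CURR_ROOT and the X registers."""

    def __init__(self):
        self.roots = {}       # ROOTS array: index -> heap address
        self.curr_root = 0    # CURR_ROOT register (assumed to start at 0)
        self.x = {}           # X registers: index -> heap address

    def advance_q(self, i):
        # ROOTS[CURR_ROOT] <- Xi; CURR_ROOT <- CURR_ROOT + 1
        self.roots[self.curr_root] = self.x[i]
        self.curr_root += 1

    def advance_p(self, i):
        # Xi <- ROOTS[CURR_ROOT]; CURR_ROOT <- CURR_ROOT + 1
        self.x[i] = self.roots[self.curr_root]
        self.curr_root += 1


m = RootsMachine()
m.x[1], m.x[3] = 10, 25       # hypothetical addresses of two query roots
m.advance_q(1)
m.advance_q(3)
m.curr_root = 0               # reset before running the program
m.advance_p(1)
m.advance_p(5)
print(m.x[1], m.x[5])
```

After the reset, the k-th advance_p picks up the address stored by the k-th advance_q, which is what pairs the k-th program feature structure with the k-th query feature structure.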

4.1.4 Multi-rooted Structures as Rules 

Recall that a rule is represented by a MRS, where the rule's head is the feature structure 
defined by the last root of the MRS, and the body is the rest of the roots. We demonstrate 
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feature structures; the two extensions are orthogonal. 

4.1 Sequences of Feature Structures 

A multi-rooted structure can be thought of as an ordered list of (not necessarily disjoint)
feature structures. The unification of such lists proceeds along the lines of the unification of
single feature structures, and thus our machine doesn't have to be radically modified.

4.1.1 Representation of a Multi-rooted Structure 

A single feature structure is characterized by one root, whereas a MRS has a sequence of 
roots. Consequently, a new data structure is needed for storing pointers to the roots. We use 
an array called ROOTS to store the addresses of the roots of a query MRS. The i-th element 
ROOTS [i] points to the i-th root of the query. A special purpose register CURR_ROOT is used 
to index the ROOTS array. 

4.1.2 Processing of a Query 

When processing a query, ROOTS [CURR_ROOT] must hold the address of the root of the feature 
structure that is currently being built. Recall that each feature structure is converted to 
a set of equations in turn. The registers are allocated consecutively, so that if, after the
execution of the k-th feature structure, the last register that was allocated is Xi, then
the first register to be allocated for the next feature structure is Xi+1. We insert a new
instruction, advance_q Xi (figure 19), after the code of each feature structure of the query,
where Xi is the first register that was allocated for the feature structure.



advance_q Xi =
    ROOTS[CURR_ROOT] <- Xi;
    CURR_ROOT <- CURR_ROOT + 1;



Figure 19: The advance_q instruction 

Assuming that CURR_ROOT is set to 0 by the initialization of the machine, this modification
guarantees that the ROOTS array will store pointers to each feature structure in the query after
its execution. To summarize, processing of a query of the form fs1, fs2, ..., fsn produces
the following code:

CURR_ROOT <- 0;
    : put_node instructions for fs1
advance_q;
    : put_node instructions for fs2
advance_q;
    : put_node instructions for fsn
advance_q;
    : put_arc instructions for fs1, ..., fsn

A technicality arises when a MRS contains a term that consists of a tag only. Such a 
tag must have appeared in the same MRS earlier, and therefore a register was allocated for 
function unify(addr1, addr2: address): boolean;
begin
    addr1 <- deref(addr1); addr2 <- deref(addr2);
    if (addr1 = addr2) then return(true);
    if (HEAP[addr1] = <REF,addr1>) then
        bind(addr1,addr2); return(true);
    if (HEAP[addr2] = <REF,addr2>) then
        bind(addr2,addr1); return(true);
    H_orig <- H;
    t1 <- *(addr1); t2 <- *(addr2);
    case unify_type[t1,t2](addr2) of
        fail: return(fail);
        trivial: bind(addr1,addr2); return(true);
    for i <- 1 to arity(t1) + 1 do
        <action,addr> <- dequeue(Q);
        case action of
            copy: HEAP[addr] <- <REF,addr1+i>;
            unify: if (not(unify(addr,addr1+i))) then return(fail);
    bind(addr1,H_orig);
    return(true);
end;



Figure 18: The code of the unify function 



the code generated for unify_type[t1,t2] will create in memory, when executed, a feature
structure of type t, including the REF cells for all the appropriate features of t.

The generated code for unification of two types assumes that a feature structure of
the first type, t1, resides on the heap, while a feature structure of the second type, t2,
is part of the program and hence isn't realized on the heap. Thus, for every feature of
t that is appropriate for t1 only, execution of the code creates a REF cell that points to
the corresponding feature in the first structure. For a feature of t that is appropriate for
t2 only, executing the code creates a self-referential REF cell, but also enqueues to Q the pair
<copy,addr> where addr is the address of this feature in memory, thus enabling later
processing to copy the value of this feature to addr. If a feature is appropriate for both t1 and
t2, the execution creates a REF cell pointing to the feature of the first structure, and also
enqueues a pair <unify,addr> to Q. Later processing will unify the values of this feature
in the two structures. Finally, for a feature that is introduced by t, execution of the code
creates a fresh structure on top of the heap.

In order to generate the unify_type functions, the type hierarchy specification has to be
processed such that the transitive closure of the subsumption relation is computed. Then,
a table is generated in which there is, for every two types, an entry that specifies the least
upper bound of these types. Moreover, this table also lists the features of the unified type
and their 'origin': whether they are appropriate for t1, t2, both or none of them.
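
The first part of this preprocessing (transitive closure, then a least upper bound for every pair of types) can be sketched as below. The sketch covers only the type part of the table, not the features and their origin. The input format (a mapping from each type to its immediate, more specific subtypes), the function name build_lub_table and the example hierarchy are assumptions made for illustration.

```python
def build_lub_table(immediate_subtypes):
    """Compute the least upper bound of every pair of types.

    immediate_subtypes maps a type to the types directly more specific
    than it. The entry for an inconsistent pair is None.
    """
    types = set(immediate_subtypes)
    for subs in immediate_subtypes.values():
        types.update(subs)

    # above[t]: every type that t subsumes, t included
    # (the reflexive, transitive closure of the subsumption relation).
    above = {t: {t} for t in types}
    changed = True
    while changed:
        changed = False
        for t in types:
            for s in list(above[t]):
                for u in immediate_subtypes.get(s, ()):
                    if u not in above[t]:
                        above[t].add(u)
                        changed = True

    table = {}
    for t1 in types:
        for t2 in types:
            common = above[t1] & above[t2]          # all upper bounds
            least = [c for c in common if common <= above[c]]
            table[(t1, t2)] = least[0] if len(least) == 1 else None
    return table


# Hypothetical hierarchy in which the unification of a and b yields c.
table = build_lub_table({"bot": ["a", "b", "d"], "a": ["c"], "b": ["c"]})
print(table[("a", "b")], table[("a", "d")])
```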

4 Extensions 

The previous section dealt with a very simple abstract machine, capable of unifying two 
simple TFSs. We now present two extensions of the machine that will allow processing 
more complex entities. Section 4.1 describes the representation and application of rules. 
Section 4.2 details the modifications needed in our design to allow for disjunction within 
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the types dl and d is listed. Note that the function returns 'trivial', rather than 'true', to 
indicate the fact that no new structures were built in memory. This value will be used by 
the function unify below. 



unify_type[dl,d](d_addr)
    HEAP[d_addr] <- <STR,dl>;    % since dl ⊔ d = dl
    return(trivial);



Figure 17: Code of unify_type[dl,d]

Another example of a trivial case of type unification is that in which the two types are 
not compatible. The instance of unify_type returns 'fail' in such cases. This leads to a call 
to the function fail, which aborts the unification. In section 4.2 we modify the definition 
of failure and allow some sort of recovery. 

In the WAM's original get_structure there was no need to call anything like unify_type.
The WAM's equivalent was a simple check to verify that both structures have the same
functor and arity. It is due to the nature of a typed system that a simple equality check has to be
replaced by a more complex operation. Since type unification adds information by returning
the features of the unified type, this operation builds new structures, in our design, that
reflect the added knowledge. Moreover, the WAM's special register S is here replaced by
a queue. S was used by the WAM to point to the next sub-term to be matched against.
In our design, as the arity of the two terms can differ, there might be a need to hold the
addresses of more than one such sub-term. This is what Q is used for: it is loaded by
the various unify_type operations with the addresses of all those sub-terms of the program
term that have yet to be matched against.

Note that the unify_variable instruction closely resembles its WAM analog, in the
case of read mode. There is no equivalent of the WAM's write mode as there are no real
variables in our system. However, in unify_value there is some similarity to the WAM's
modes, where the 'copy' action corresponds to write mode and the 'unify' action to read
mode. In this latter case we have to call the function unify, just like the WAM does.

The unify function (figure 18) is very similar to unify_type. In fact, it calls the
latter as a subroutine. Recall that unify_type is used to perform the type
unification of two structures, only one of which is represented on the heap. unify does
the same, with two differences: first, both feature structures are in memory; second, full
unification has to be performed. The first difference is the reason for removing an item
from the queue Q and using it as a part of the unification process; the second is realized by
recursive calls to unify for subgraphs of the unified graphs.

The function unify is independent of the types of the structures it operates upon, and 
thus only one copy of it exists. It receives four parameters: two types and the two addresses 
of the corresponding feature structures on the heap. 

3.6 Compilation of the Type Hierarchy 

The heart of the unification process lies in the function unify_type. This function is
generated by compiling the type hierarchy specification; in fact, there are many such functions, one
for every pair of types. If the two types are not consistent, their corresponding unify_type
function simply returns fail. However, for consistent types the function produces not only
the unified type, but also a feature structure skeleton for this type, that lists all the features
that are appropriate for the type. For example, if t1 and t2 are such that t1 ⊔ t2 = t, then
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The dereferenced value of Xi, addr, can either be a self-referential REF cell, i.e., an 
unbound variable, or an STR cell. In the first case, the feature structure has to be built by
the program. A new feature structure is built on top of the heap (using code similar
to that of put_structure) with addr being set to point to it. The second case, in which Xi
points to an existing feature structure of type t', is the more interesting one. Here we have
to unify an existing feature structure with a new one whose type is t. This is the place to
call the compiled code of unify_type with t and t'.

The operation of unify_type will be explained in section 3.6. However, it is important to
understand that as a result of this operation a new global queue, Q, is being added to. Every
element of Q is a pair <action,addr> where action is either 'copy' or 'unify'. For each
feature of the program term, Q determines whether this feature should be simply copied to
the unified feature structure, or whether it must be first unified with the appropriate feature
in the other structure. The contents of Q will be used by the two unify instructions.

The function unify_type, called by get_structure, is generated as a result of the
compilation of the type hierarchy. While the details of this process are given in section 3.6, we
list below (figure 16) the compiled code for the unification of our two running examples, of
types a and b. The second of these structures exists on the heap; the first one has yet to
be generated, as it is part of the program, and we don't represent the program as feature
structures in memory. Thus the parameters to unify_type, in addition to the two types of
the arguments, have to include the address of the existing feature structure in memory.

Note that there isn't one unify_type function, but rather many instances of it, arranged
as a two-dimensional array indexed by the types of the arguments for the unification. The
code we list below, therefore, is referenced as unify_type[a,b](b_addr), where a and b are
the types of the arguments, and b_addr is the address of the existing feature structure (of
type b).



unify_type[a,b](b_addr)
    HEAP[H] <- <STR,c>;              % since a ⊔ b = c
    HEAP[H+1] <- <REF,H+1>;          % the value of f1 is yet unknown
    enqueue(Q,copy,H+1);
    HEAP[H+2] <- <REF,b_addr+1>;     % f2 is taken from b
    HEAP[H+3] <- <REF,b_addr+2>;     % f3 is taken from b
    enqueue(Q,unify,H+3);            % but still has to be unified with a
    HEAP[H+4] <- <REF,H+5>;          % f4 is a new structure: build it.
    HEAP[H+5] <- <STR,bot>;
    bind(b_addr,H);                  % = HEAP[b_addr] <- <REF,H>
    H <- H + 6;
    return true;





Figure 16: Code of unify_type[a,b]
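
To make the effect of this code concrete, the sketch below transcribes it into Python. The heap is modelled as a list of (tag, value) pairs whose length plays the role of H, and Q as a collections.deque; the function name unify_type_a_b and the initial heap contents are hypothetical.

```python
from collections import deque


def unify_type_a_b(heap, queue, b_addr):
    """Python rendering of the listing for types a and b.

    heap  : list of (tag, value) cells; len(heap) acts as H
    queue : the global queue Q of (action, address) pairs
    b_addr: address of the resident structure of type b
    """
    h = len(heap)
    heap.extend([None] * 6)               # H <- H + 6
    heap[h] = ("STR", "c")                # a and b unify to c
    heap[h + 1] = ("REF", h + 1)          # f1: value not yet known
    queue.append(("copy", h + 1))
    heap[h + 2] = ("REF", b_addr + 1)     # f2 is taken from b
    heap[h + 3] = ("REF", b_addr + 2)     # f3 is taken from b ...
    queue.append(("unify", h + 3))        # ... but must be unified with a
    heap[h + 4] = ("REF", h + 5)          # f4 is new: build it
    heap[h + 5] = ("STR", "bot")
    heap[b_addr] = ("REF", h)             # bind(b_addr, H)
    return True


# A resident structure of type b at address 0 with two features.
heap = [("STR", "b"), ("REF", 3), ("REF", 4), ("STR", "bot"), ("STR", "bot")]
q = deque()
unify_type_a_b(heap, q, 0)
print(heap[0], heap[5], list(q))
```

After the call, the old node of type b is a REF cell pointing to the new node of type c, so every existing pointer to it reaches the unified structure after dereferencing.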

The code for the type unification of the types a and b is rather complex. In many 
cases the code is much simpler: for example, when the type of the feature structure that is 
resident in memory is subsumed by the type of the program feature structure, nothing has 
to be done. As another example, if the program's type is subsumed by the query's type, 
then the program's additional features have to be added to the resident term. But if no such 
features exist, the only thing that the function must do is change the type of the resident 
structure. An example of such a case is depicted in figure 17, where the type unification of 



We assume the normal operations on queues, where 'enqueue' adds an item on one end and 'dequeue' 
removes an item from the other end. 
get_structure t/n,Xi =
    addr <- deref(Xi); Xi <- addr;
    case HEAP[addr] of
        <REF,addr>:                      % uninstantiated cell
            HEAP[H] <- <STR,t>;
            bind(addr,H);                % HEAP[addr] <- <REF,H>
            H <- H + 1;
            for j <- 1 to n do
                HEAP[H] <- <REF,H>;
                enqueue(Q,copy,H);
                H <- H + 1;
        <STR,t'>:                        % a node
            if (unify_type[t,t'](addr) = fail) then fail;

unify_variable Xi =
    <action,addr> <- dequeue(Q);
    Xi <- addr;

unify_value Xi =
    <action,addr> <- dequeue(Q);
    case action of
        copy: HEAP[addr] <- *(Xi);
        unify: if (unify(addr,Xi) = fail) then fail;



Figure 14: Implementation of the get/unify instructions 



function deref(a: address): address;
begin
    <tag,value> <- HEAP[a];
    while (tag = REF and value ≠ a)
        a <- value;
        <tag,value> <- HEAP[a];
    return(a);
end;



Figure 15: Implementation of the deref function 
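
For readers who prefer executable code, the same loop can be written in Python as follows. The heap is assumed to be a mapping from addresses to (tag, value) pairs; the example heap is hypothetical.

```python
def deref(heap, a):
    """Follow a chain of REF cells starting at address a.

    Stops at an STR cell or at a self-referential REF cell
    (an uninstantiated variable) and returns that address.
    """
    tag, value = heap[a]
    while tag == "REF" and value != a:
        a = value
        tag, value = heap[a]
    return a


heap = {1: ("REF", 2), 2: ("REF", 3), 3: ("STR", "d"), 4: ("REF", 4)}
print(deref(heap, 1), deref(heap, 4))
```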
put_node t/n,Xi =
    HEAP[H] <- <STR,t>;
    Xi <- H;
    H <- H + n + 1;

put_arc Xi,offset,Xj =
    HEAP[Xi+offset] <- <REF,Xj>;



Figure 11: The implementation of the put instructions 



put_node b/2,X1        % X1 = b(
put_arc X1,1,X2        %        X2,
put_arc X1,2,X3        %        X3)
put_node b/2,X2        % X2 = b(
put_arc X2,1,X4        %        X4,
put_arc X2,2,X4        %        X4)
put_node d/0,X4        % X4 = d
put_node d/0,X3        % X3 = d



Figure 12: Compiled code for the query b(b([1]d,[1]),d)



Three kinds of machine instructions are generated when processing an equation of the
form $X_{i_0} = t(X_{i_1}, \ldots, X_{i_n})$ that is part of a program term. The first instruction is
get_structure t/n, $X_{i_0}$, where n is the arity of t. For each argument $X_{i_j}$ of t an instruction of the
form unify_variable $X_{i_j}$ is generated if $X_{i_j}$ is first seen; if it was already seen in the
current term, unify_value $X_{i_j}$ is generated.

For example, the machine code that results from processing the program a([3]dl,[3]) is
depicted in figure 13. The implementation of these three instructions is given in figure 14.



get_structure a/2,X1       % X1 = a(
unify_variable X2          %        X2,
unify_value X2             %        X2)
get_structure dl/0,X2      % X2 = dl



Figure 13: Compiled code for the program a([3]dl,[3]). 

The heart of the implementation lies in the functions unify and unify_type. These 
functions perform the actual unification of the two feature structures. The get_structure 
instruction is generated for a feature structure fsp of a type t which is associated with a 
register Xi. It matches fsp against a feature structure fsq that resides in memory using 
Xi as a pointer to the address of fsq on the heap. Since fsq might have undergone some 
type inference or previous binding (for example, due to previous unifications caused by other 
instructions), the value of Xi must first be dereferenced. This is done by the function deref 
(figure 15) which follows a chain of REF cells until it gets to one that either points to an 
STR cell or is self-referential. This is the value it returns. 



We use the operator '*' to refer to the contents of an address or a register. We also use '++' as an
'increment' operator, and '--' as 'decrement'.
flatten(fs):
    i <- 1; flatten1(fs);

flatten1(fs):
    if fs consists of a tag [j] only, return Reg[j]
    else let fs be [j]t(f1, f2, ..., fn)
        if Reg[j] is not defined
            Reg[j] <- i;
        for k <- 1 to n do
            jk <- flatten1(fk);
        print X_Reg[j] = t(X_Reg[j1], ..., X_Reg[jn])
        return Reg[j];







Figure 9: The algorithm for flattening terms 



Linear representation      Set of equations

a([3]dl,[3])               X1 = a(X2,X2)
                           X2 = dl

b(b([1]d,[1]),d)           X1 = b(X2,X3)
                           X2 = b(X4,X4)
                           X4 = d
                           X3 = d



Figure 10: Feature structures as sets of equations 



put_arc $X_{i_0}$, j, $X_{i_j}$ is generated. put_node creates a representation of a node of type t
on top of the heap and stores its address in $X_{i_0}$; it also increments H to leave space for the
arcs. put_arc fills this space with REF cells.

In order for put_arc to operate correctly, the registers it uses must be initialized. Since
only put_node sets the registers, all put_node instructions must be executed before any
put_arc instruction is. Hence, we maintain two separate streams of instructions, one for
put_node and one for put_arc, and execute the first completely before moving to the other.
This compilation scheme is called for by the cyclic character of feature structures: as
explained in [3], the original single-streamed WAM scheme would fail on cyclic terms.

The implementation of the two instructions is given in figure 11. Figure 12 lists the
machine code that results from compiling the term b(b([1]d,[1]),d). When this code is executed
(first the put_node instructions, then the put_arc ones), the resulting representation of the
feature structure in memory is the one shown above in figure 8.
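
The two-stream execution can be checked with a short simulation. The sketch below is illustrative only: the heap and the registers are Python dictionaries, the heap is assumed to start at address 1 as in figure 8, and the class name QueryHeap is hypothetical. Recording the node's address in the register follows the prose description of put_node.

```python
class QueryHeap:
    """Toy heap with the two query instructions."""

    def __init__(self, start=1):
        self.cells = {}     # address -> (tag, value)
        self.h = start      # H: next free address
        self.x = {}         # X registers

    def put_node(self, t, n, i):
        self.cells[self.h] = ("STR", t)
        self.x[i] = self.h              # Xi <- H
        self.h += n + 1                 # leave room for n arcs

    def put_arc(self, i, offset, j):
        self.cells[self.x[i] + offset] = ("REF", self.x[j])


m = QueryHeap()
# First stream: all put_node instructions for b(b([1]d,[1]),d).
for t, n, i in [("b", 2, 1), ("b", 2, 2), ("d", 0, 4), ("d", 0, 3)]:
    m.put_node(t, n, i)
# Second stream: all put_arc instructions.
for i, offset, j in [(1, 1, 2), (1, 2, 3), (2, 1, 4), (2, 2, 4)]:
    m.put_arc(i, offset, j)

for addr in sorted(m.cells):
    print(addr, *m.cells[addr])
```

Running the put_arc stream first would fail in this model because the registers it reads would not yet be set, which mirrors the reason given above for keeping two streams.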

3.5 Processing of a Program 

Unlike the WAM, in our framework registers that are set by the execution of a query cannot
be helpful when processing a program. The reason is that there is no one-to-one
correspondence between the sub-terms of the query and the program, as the arity of the feature
structures can be different even when the structures are unifiable. We still use the Xi
registers, but (with the exception of X1) their old values are not used during execution of the
program.
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for. As we use a total typing system, the arity of a type is the number of arcs leaving a 
node of that type. This number is constant for all feature structures of this type; hence, we 
can keep the WAM's convention of storing all the outgoing arcs from a node consecutively 
following the representation of the node. Given a type and a feature name, we can statically
determine the position of the arc corresponding to this feature in a specific feature structure
of the given type; the subgraph that is the value of this feature can be accessed in one step.
This is a major difference between our method and the approach presented in [3]; we believe 
that it leads to a more efficient system without harming the elegance of the machine design. 

It is important to note that STR cells differ from their WAM analogs in that they can 
be dereferenced when a type is becoming more specific. In such cases, a chain of REF cells 
leads to the dereferenced STR cell. Thus, if a feature structure is modified, only its STR 
cell has to be changed in order for all pointers to it to 'feel' the modification automatically. 

We keep the WAM's convention of representing an uninstantiated variable as a
self-referential REF cell. Such a variable stands for a feature whose value is temporarily
unknown. This is different from the Prolog definition of uninstantiated variables, as in our
system there is always at least partial information as to the type of a structure.

Each node is represented using one cell, and each arc consumes one cell as well. So 
for representing a graph of n nodes and m arcs, n + m cells are needed. Of course, during 
unification nodes can become more specific and a chain of REF cells is added to the account, 
but the length of such a chain is bounded by the depth of the type hierarchy and dereferencing 
cuts it occasionally. 

As an example, figure 8 depicts a possible heap representation of the feature structure
b(b([1]d,[1]),d), starting from address 1.



Address   Tag   Value
1         STR   b
2         REF   4
3         REF   8
4         STR   b
5         REF   7
6         REF   7
7         STR   d
8         STR   d



Figure 8: Heap representation of the feature structure b(b([1]d,[1]),d)



3.3 Flattening Feature Structures 

In order to represent a graph on the heap, its linear representation (as a normal term, see 
section A.2) is transformed to a set of "equations", each having a flat format, i.e., no nesting 
is allowed. To facilitate this we use a set of registers {Xi} that store addresses of feature 
structures in memory. We associate a register with each tag [j] of a normal term. Let 
Reg[j] be the register associated with the tag [j]. The algorithm for flattening a linear 
representation is given in figure 9. In figure 10 there are examples of the equation sets 
corresponding to each of the example feature structures. 
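
As an illustration of the idea, the following Python sketch flattens a nested term into flat equations. It is not the algorithm of figure 9: the Term class, the breadth-first numbering of registers and the use of fresh registers for untagged subterms are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Term:
    type: Optional[str] = None            # None for a bare tag occurrence such as [1]
    args: List["Term"] = field(default_factory=list)
    tag: Optional[int] = None             # the reentrancy tag [j], if any


# (i0, t, [i1, ..., in]) stands for the equation X_i0 = t(X_i1, ..., X_in)
Equation = Tuple[int, str, List[int]]


def flatten(root: Term) -> List[Equation]:
    reg_of_tag: Dict[int, int] = {}       # plays the role of Reg[j]
    equations: List[Equation] = []
    defined = set()                       # registers that already have an equation
    counter = 0

    def register(t: Term) -> int:
        nonlocal counter
        if t.tag is not None and t.tag in reg_of_tag:
            return reg_of_tag[t.tag]
        counter += 1                      # assumed: untagged subterms get fresh registers
        if t.tag is not None:
            reg_of_tag[t.tag] = counter
        return counter

    queue = [(register(root), root)]
    while queue:
        reg, t = queue.pop(0)
        if t.type is None or reg in defined:
            continue                      # bare tag, or already flattened
        defined.add(reg)
        arg_regs = []
        for a in t.args:
            r = register(a)
            arg_regs.append(r)
            queue.append((r, a))
        equations.append((reg, t.type, arg_regs))
    return equations


def show(eq: Equation) -> str:
    reg, t, args = eq
    body = t if not args else f"{t}({', '.join(f'X{i}' for i in args)})"
    return f"X{reg} = {body}"


# The term b(b([1]d,[1]),d) of Figure 8.
EXAMPLE = Term("b", [Term("b", [Term("d", tag=1), Term(tag=1)]), Term("d")])
```

For EXAMPLE the sketch yields the equations X1 = b(X2, X3), X2 = b(X4, X4), X3 = d and X4 = d, so the two occurrences of the tag [1] share the register X4.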

3.4 Processing of a Query 

When processing an equation of the form $X_{i_0} = t(X_{i_1}, X_{i_2}, \ldots)$, representing part of a 
query, two different instructions can be generated. The first instruction is put_node t/n, 
$X_{i_0}$, where n is the arity of t. Then, for every argument $X_{i_j}$, an instruction of the form 

Figure 7: The modified MRS 



queries. Each query is compiled before its execution; the resulting code is executed prior to 
the execution of the compiled program. 

Processing of a 'query' is aimed towards building a graph representation of the query 
in the machine's memory. The processing of a 'program' must produce code that, during 
run-time, will unify the program with a query already resident in memory. The result of the 
unification will be a new feature structure, represented as a graph in the machine's memory. 

3.1 First-Order Terms and Feature Structures 

While TFSs resemble FOTs in many aspects, it is important to note the differences between 
them. First, TFSs are typed, as opposed to (ordinary) FOTs. Types can be captured 
as labeling the nodes of a feature structure. In addition, TFSs label the arcs by feature 
names, whereas FOTs use a positional encoding for argument structure. A more important 
difference is the ability to share values within TFSs: while FOTs are essentially trees, with 
possibly shared leaves, TFSs are directed graphs, i.e., variables can occur anywhere within 
a feature structure. Moreover, our system doesn't rule out cyclic structures, so that infinite 
terms can be represented, too. 

FOTs are said to be consistent only if they have the same functor and the same arity. 
TFSs, on the contrary, can be unified even if their types differ (as long as they have a 
non-degenerate least upper bound). Moreover, their arity can differ, and the arity of the 
unification result can be greater than the arity of any of the unificands. 

These differences are the reasons for many deviations from the original WAM that were 
necessary in our design. In the following sections we try to emphasize the points where such 
deviations were made. 



3.2 Representation of Feature Structures in Memory 

Following the WAM, we use a global, one-dimensional array called HEAP of data cells, where 
a cell's address is its index in the array. A global register H points to the (current) top 
element of HEAP. Data cells are tagged: STR cells correspond to nodes, while REF cells 
correspond to arcs. Hence, an STR cell contains the type associated with the node it stands 
for, and a REF cell contains the address of the node that is the target of the arc it stands 



consists of the two leftmost ones. Note that the feature structures in the rule share some 
nodes; for instance, the node whose type is d1 is common to all three feature structures. A 
possible linear representation for p is: 

p : b(b([2]d,[2]),[4]d1), a([4],[4]) => b(b([2],[4]),d2) 

In figure 6 a MRS a, consisting of two feature structures, is described. We might represent 
a linearly as: 

a: a([3]d1,[3]), b(b([1]d,[1]),[3]) 

When rule p is applied to a, p's body is unified with a. In figure 7 we list the MRS consisting 
of the new, unified body and the modified head. 




Figure 5: Graphical representation of the rule p 




Figure 6: A multi-rooted structure a 



3 The Basic Machine 



The heart of the machine design is the unification of two feature structures. Following 
the WAM, we call one of them the program and the other the query. Both the program and the 
query are compiled. The program is compiled once, just like an ordinary program, to produce 
machine instructions. One program is usually designed to be executed against many different 

2.4 Representation of Rules 

A multi-rooted structure (MRS) is a connected, directed, labeled, finite graph with an 
ordered non-empty set of distinguished nodes, roots. We use MRSs to represent rules, 
where the graph that is reachable from the last root is the head^ of the rule, and those 
that are reachable from the rest of the roots form the body of the rule^. A MRS is linearly 
represented as a sequence of terms, separated by commas, where two occurrences of the 
same tag, even within two different terms, denote reentrancy (that is, the scope of the tags 
is the entire sequence of terms). The head is preceded by '=>' rather than by a comma. See 
Appendix A.6 for the exact details. 

Application of a rule amounts to unifying its body with a multi-rooted structure resident 
in memory and producing its head as a result. Since the head and the body of the rule 
might share values, unifying the body with the existing MRS might influence the head as 
well. Thus, the head that is produced as a result of the rule application depends also on the 
resident MRS. The formal definition of rule application is given in Appendix A.6. 

An example of a rule, p, is graphically depicted in figure 5. In this example the rule 
consists of a MRS of length three, the roots of which are the grey nodes displayed as the 
uppermost nodes. The head of the rule is the rightmost feature structure, while the body 



This meaning of head must not be confused with the linguistic definition, referring to the core features 
of a phrase. 

Notice that the intuitive direction is reversed: this will ease later processing of the rules. 



the type a, while the second is inherited from b. The third feature, f3, is common to both 
a and b, while the last feature is a new one, introduced by c. In our representation of types 
we always assume that the features are totally ordered and that their order is given as part 
of the specification. In the examples below we assume that the order is the lexicographic 

Figure 1: The type-unification table for H 



2.3 Representation of Feature Structures 

The most convenient graphical representation of feature structures is attribute-value matrices 
(AVMs). However, to represent a (totally well-typed) feature structure linearly we use 
a notation that resembles a first-order term, where the type plays a similar role to that of 
a function symbol and the features are listed in a fixed order. Reentrancy is implied by 
attaching identical tags to reentrant feature structures. This representation is based upon 
Aït-Kaci's $\psi$-terms ([4, 6]); its definition and semantics are given in Appendix A.2. 

Total well-typedness means that the names of the features in a feature structure can be 
coded by their position, and thus feature-names are omitted from the linear representation. 
They can be recovered from the type and the positions of the features in the argument list 
of the term. 
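
The recovery of feature names from positions can be sketched as follows, using the running type hierarchy. The tables SUPERTYPES and INTRODUCED are transcribed by hand from the example specification, the function names are hypothetical, and the lexicographic feature order is the assumption used in the paper's examples.

```python
from typing import Dict, List

# Immediate supertypes in the running example (hand-transcribed).
SUPERTYPES: Dict[str, List[str]] = {
    "bot": [], "a": ["bot"], "b": ["bot"], "d": ["bot"],
    "c": ["a", "b"], "e": ["b"], "d1": ["d"], "d2": ["d"],
}

# Features introduced by each characterization statement.
INTRODUCED: Dict[str, List[str]] = {
    "a": ["f1", "f3"], "b": ["f2", "f3"], "c": ["f4"],
}


def features(t: str) -> List[str]:
    """Features appropriate for t: its own plus the inherited ones, in lexicographic order."""
    names = set(INTRODUCED.get(t, []))
    for s in SUPERTYPES.get(t, []):
        names.update(features(s))
    return sorted(names)


def label_arguments(t: str, args: List[str]) -> Dict[str, str]:
    """Pair each positional argument of a linear term with its feature name."""
    names = features(t)
    if len(names) != len(args):
        raise ValueError(f"type {t} expects {len(names)} arguments, got {len(args)}")
    return dict(zip(names, args))
```

Under these assumptions, label_arguments("a", ["[3]d1", "[3]"]) pairs the first argument with f1 and the second with f3, and features("c") gives f1, f2, f3, f4.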

We will later on use two feature structures, A and B, to exemplify the machine instructions 
and their operation. These structures are described below, in Figures 2 and 3, each represented 
as an AVM, as a linear term and as a graph (where a grey node denotes a root). 



AVM description: 

    [ a
      f1 : [3] d1
      f3 : [3]    ]

Graph representation: (graph not reproduced) 

Linear representation: a([3]d1,[3]) 

Figure 2: An example feature structure A 




The basic operation performed on feature structures is unification. There are various 
definitions for feature structure unification, and we base our unification algorithm 
(Definition A.17) on Carpenter's definition ([9]). The exact details are given in Appendix A.4. An 
example of the unification operation is given in figure 4 below, where the arguments are the 
example feature structures A and B. 
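
To give a feel for the operation, here is a minimal Python sketch of destructive unification over typed graphs, in the familiar union-find style. It is not Definition A.17: the Node class, the hand-written UP table for the running hierarchy and the second structure used in the test below are assumptions, and the sketch does not add the features that total well-typedness would require for the resulting type.

```python
from typing import Dict, Optional

# UP[t] = all types subsumed by t in the running example (hand-transcribed).
UP = {
    "bot": {"bot", "a", "b", "c", "d", "d1", "d2", "e"},
    "a": {"a", "c"}, "b": {"b", "c", "e"}, "c": {"c"}, "e": {"e"},
    "d": {"d", "d1", "d2"}, "d1": {"d1"}, "d2": {"d2"},
}


def lub(t1: str, t2: str) -> Optional[str]:
    """Least upper bound of two types, or None if they are inconsistent."""
    common = UP[t1] & UP[t2]
    for u in common:
        if common <= UP[u]:
            return u
    return None


class Node:
    def __init__(self, type_: str, feats: Optional[Dict[str, "Node"]] = None):
        self.type = type_
        self.feats = dict(feats or {})
        self.forward: Optional["Node"] = None   # set once the node is merged away


def find(n: Node) -> Node:
    """Representative of the equivalence class of n."""
    while n.forward is not None:
        n = n.forward
    return n


def unify(n1: Node, n2: Node) -> bool:
    """Merge n2 into n1; a failed call may leave partial modifications behind."""
    n1, n2 = find(n1), find(n2)
    if n1 is n2:
        return True
    t = lub(n1.type, n2.type)
    if t is None:
        return False
    n2.forward = n1          # merge before recursing so that cycles terminate
    n1.type = t
    for f, v2 in list(n2.feats.items()):
        rep = find(n1)       # n1 itself may have been merged during recursion
        if f in rep.feats:
            if not unify(rep.feats[f], v2):
                return False
        else:
            rep.feats[f] = v2
    return True
```

Unifying the structure A = a([3]d1,[3]) with a hypothetical structure of type b in this sketch produces a node of type c whose f1 and f3 values are still shared.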



every feature is introduced by some least type (and is appropriate for all the types it sub- 
sumes), and that appropriateness be monotone. A last requirement of the appropriateness 
specification is that it does not contain loops. For a formal presentation of the above notions 
see Appendix A.1. 

Third, we require that the feature structures with which we deal be totally well-typed. 
This property is most convenient and results in more efficient processing. It is more prob- 
lematic from the user's point of view, as users might find it useful to specify only partial in- 
formation about linguistic entities. Therefore, some description language must be provided, 
such that the user is able to give partial descriptions from which totally well-typed feature 
structures can be automatically deduced. Various description languages were suggested for 
feature structures in general and TFS in particular. In many cases the manipulation of 
the structures (e.g., unification) is defined for the description rather than over the objects 
themselves. As there are efficient algorithms to deduce structures from their descriptions, 
we prefer not to commit ourselves to one description language. We define our system over 
explicit representations of TFS, as will be clear from section 2.3. 

2.2 Type Specification 

The first part of a program (or a grammar) is a type specification. As described above 
this should contain the type hierarchy and the appropriateness specification. We adopt 
Carpenter's format ([8]) for this specification: it is a sequence of statements of the form 

    t sub [t1, t2, ..., tn] intro [f1 : r1, ..., fm : rm]. 

where $t, t_1, \ldots, t_n, r_1, \ldots, r_m$ are types, $f_1, \ldots, f_m$ are feature names and $n, m \ge 0$. 

Such a statement, which is said to characterize $t$, means that $t_1, \ldots, t_n$ are subtypes of 
$t$ (i.e., for every $i$, $1 \le i \le n$, $t \sqsubseteq t_i$), and that $t$ has the features $f_1, \ldots, f_m$ appropriate for it. 
Moreover, these features are introduced by $t$, i.e., they are not appropriate for any type $t'$ 
such that $t' \sqsubset t$. Finally, the statement specifies that the appropriate values for each feature 
$f_i$ in $t$ should be of type $r_i$. We demand that each type (except $\top$ and $\bot$) is characterized 
by exactly one statement. 

The full subsumption hierarchy of the types is the reflexive transitive closure of the $\sqsubseteq$ 
relation as specified by the characterization statements. If this relation is not a bounded 
complete partial order, the specification is rendered invalid. The same is true in case it is not 
an appropriateness specification (see Definition A.3) or contains a loop (see Definition A.6). 

We use the following type hierarchy H as a running example: 

    bot sub [a,b,d]. 
    a sub [c] intro [f1:bot,f3:d1]. 
    c sub [] intro [f4:bot]. 
    b sub [c,e] intro [f2:bot,f3:d]. 
    d sub [d1,d2]. 
    d1 sub []. 
    d2 sub []. 

The type bot stands for $\bot$ in this specification. The type $\top$ is systematically omitted from 
type specifications. 
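
The subsumption order induced by these statements, and the least upper bound of every pair of types, can be computed with a short Python sketch. The SUBTYPES table is transcribed by hand from the specification above; the function names are hypothetical, and None is used here to stand for an inconsistent pair, whose bound is $\top$.

```python
from itertools import product
from typing import Dict, List, Optional, Set, Tuple

# Immediate subtypes, transcribed from the characterization statements.
SUBTYPES: Dict[str, List[str]] = {
    "bot": ["a", "b", "d"],
    "a": ["c"],
    "c": [],
    "b": ["c", "e"],
    "d": ["d1", "d2"],
    "d1": [],
    "d2": [],
}


def upward_closure(sub: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Map each type t to the set of types it subsumes (reflexive transitive closure)."""
    types = set(sub) | {s for kids in sub.values() for s in kids}

    def collect(t: str, seen: Set[str]) -> Set[str]:
        if t not in seen:
            seen.add(t)
            for s in sub.get(t, []):
                collect(s, seen)
        return seen

    return {t: collect(t, set()) for t in types}


def lub(t1: str, t2: str, ups: Dict[str, Set[str]]) -> Optional[str]:
    """Least upper bound of t1 and t2, or None when the two types are inconsistent."""
    common = ups[t1] & ups[t2]
    if not common:
        return None
    least = [u for u in common if common <= ups[u]]
    if len(least) != 1:
        # Upper bounds exist but none is least: not bounded complete.
        raise ValueError(f"no unique least upper bound for {t1} and {t2}")
    return least[0]


def unification_table(sub: Dict[str, List[str]]) -> Dict[Tuple[str, str], Optional[str]]:
    """One entry per ordered pair of types; the table is symmetric."""
    ups = upward_closure(sub)
    return {(s, t): lub(s, t, ups) for s, t in product(sorted(ups), repeat=2)}
```

For the running example the sketch gives c as the bound of a and b, and no bound for a and d. The ValueError branch corresponds to a specification that is rejected because it is not bounded complete.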

The type-unification (or least upper bound) table, consisting of an entry for every pair 
of types, can be computed at compilation time. The appropriate table for the hierarchy H is 
depicted^ in figure 1. Note that the table encodes not only the features of the unified type, 
but also the 'origin' of these features. For example, the table entry for the types a(f1,f3) and 
b(f2,f3) is c(f1,f2,f3,f4). This entry states that the first feature of the type c is inherited from 



^ As such tables are always symmetric, only their upper part is shown. 



though there were prior interpreters and compilers for Prolog, it was the Warren Abstract 
Machine (WAM) that gave the language not only a good, efficient compiler, but, perhaps 
more importantly, an elegant operational semantics. 

The WAM implementation of Prolog consists of a machine, augmented by a compiler 
into its instruction set. The meaning of each instruction is defined using a low-level language 
that can be mapped to any ordinary hardware. In fact, there is even a formal verification 
of the correctness of this implementation ([21]). 

The WAM immediately became the starting point for many Prolog compilers. The 
techniques it delineates serve not only for Prolog proper, but also for constructing compilers 
for related languages. To list just a few examples, abstract machine techniques were used for 
a parallel Prolog compiler ([16]), for variants of Prolog that use different resolution methods 
([24]), and for a general theorem prover ([22]). 

A careful design of such an abstract architecture must compromise between two, usually 
conflicting, requirements: it must be close enough to the high-level language in order to 
capture its semantics and to accommodate simple compilation; on the other hand, it must 
remain close to common architectures so that its language can be efficiently executed. 

1.3 Structure of the Document 

The next section sketches the fundamental notions needed for understanding our design. 
We define type hierarchies, feature structures, well-typedness conditions and unification. 
We also give a FOT-like representation of TFSs. Then we extend the definitions to license 
sequences of TFSs. Section 3 describes the characteristics of the basic abstract machine we 
design: feature structure representation in memory, flattening an FOT-like TFS and simple 
unification of two TFSs. We also detail the process of compiling the type hierarchy. Section 4 
presents extensions of this basic abstract machine: in section 4.1 we extend the machine so 
that it can handle sequences of feature structures; section 4.2 introduces disjunctive TFSs 
and delineates the changes in the design that are needed in order to manipulate them. 
A conclusion and plans for further research are given in section 5. Appendix A provides a 
more detailed mathematical background, along the lines of [9]. Appendix B lists all the machine 
instructions and auxiliary functions. 

2 The Framework 

2.1 Fundamental Notions 

An HPSG grammar consists of a type specification (the signature) and grammar rules 
(including principles and lexical rules). The basic entity of HPSG is the feature structure 
which is a connected, directed, labeled, possibly cyclic, finite graph, whose nodes are deco- 
rated with types and whose edges are labeled by features. The types are ordered according 
to an inheritance hierarchy where higher types inherit features from their super-types. 
A formal definition of types and feature structures is given in Appendix A.1. 

As there are many different formalizations of TFS systems it is important to define the 
framework with which we work; it is, with slight modifications, Carpenter's system ([8, 10]). 
First, we use a set of types that includes both $\bot$, the least type, and $\top$, the greatest one. 
We order types by subsumption according to their information content, not according to 
the cardinality of the set of elements they can be assigned to. This means that the type $\bot$ 
is the most general type, subsuming every other, and the type $\top$ is the contradictory type, 
subsumed by every other. In this we follow, e.g., [10, 9] but not [5, 23]. 

Second, we require that the type inheritance hierarchy be bounded complete, that is, 
every set of consistent types must have a unique least upper bound (other than $\top$). We 
also require that the appropriateness specification of the features and the types be such that 



In designing the machine we try to capture the intuitive meaning of the linguistic formalism 
and to reflect it in the machine architecture. The operational semantics of each instruction 
is defined using a low-level language that can be executed on ordinary hardware. We thus 
expect a substantial improvement over existing parsers in both space and time requirements. 
Recently, a similar approach was applied to the LIFE language ([3]); however, due to 
differences in the motivation and in the formalisms, our machine differs considerably. As far 
as we know, this is the first attempt to use an abstract machine for a linguistically motivated 
formalism. 

The use of an abstract machine ensures that every grammar specified using our system 
is endowed with a concrete, well defined operational meaning. We thus provide a means for 
rigorously stating mathematical properties of specific grammars as well as entire formalisms. 
For example, it will make it possible to formally verify the correctness of a compiler for HPSG, 
given an independent definition, or to prove the equivalence of two HPSG grammars. 

1.1 Related Works 

The first language to combine feature structures, typing hierarchies and constraint specification 
is probably LOGIN ([4]). In this system FOTs are replaced by the more general $\psi$-terms 
and a partially-ordered set of types introduces a built-in inheritance to the language. A natural 
descendant of LOGIN is LIFE ([2, 5]), where styles from functional programming are added 
to the basic constructs. An abstract machine for LIFE is currently under development and 
preliminary results were very recently described in [3]. 

Another effort, motivated by linguistic needs, is TFS ([26]); it uses a partially ordered 
set of types and allows general constraints over typed feature structures to be specified. 
An abstract rewrite machine is employed to resolve these constraints in no a priori order, 
i.e., grammars can be used for both parsing and generation. This leads to very inefficient 
processing. 

Recently two other systems were constructed that allow the specification of logical con- 
straints over typed feature structures. ALE ([8]) is meant to be a general logic engine, where 
definite clauses over typed feature structures can be specified. In addition, the language 
provides means for specifying phrase structure rules, and a parsing algorithm is embedded 
within it. However, ALE limits the type hierarchy to be a bounded complete partial order, 
where unique unifiers exist for every consistent subset of types. 

CUF ([11]) is more ambitious than the former systems, as it aims to cover as many of the 
extensions to simple unification formalisms as possible. It supports partial ordering of 
types but doesn't demand it; feature structures can contain disjunction, negation, list- and 
set-values and arbitrary functional and relational constraints. Separate control statements 
guide the resolution process. 

What is common to all the above-mentioned systems is that even though they can be 
used to specify natural language grammars, and indeed many of them were motivated by 
linguistic applications, they are very general and completely independent of any particular 
linguistic theory or formalism. Many of these frameworks were used for devising grammars 
for HPSG, but it is hard to compare the grammars that were designed for the different 
systems. 

1.2 Abstract Machines 

Abstract machines were used for various kinds of languages: starting from Landin's SECD 
([18]), many compilers for functional languages were designed this way. Even imperative 
languages such as Pascal were implemented using abstract machine techniques (P-Code). 
When logic programming languages appeared, such techniques were applied to them as well. 
Notably, Warren designed an abstract machine for the execution of Prolog ([25], [1]). Even 

Abstract 

This paper describes a first step towards the definition of an abstract machine for 
linguistic formalisms that are based on typed feature structures, such as HPSG. The 
core design of the abstract machine is given in detail, including the compilation process 
from a high-level specification language to the abstract machine language and the 
implementation of the abstract instructions. We thus apply methods that have proved 
useful in computer science to the study of natural languages: a grammar specified 
using the formalism is endowed with an operational semantics. Currently, our machine 
supports the unification of simple feature structures, unification of sequences of such 
structures, cyclic structures and disjunction. 

1 Introduction 

Typed feature structures (TFSs) serve as a means for the specification of linguistic informa- 
tion in current linguistic formalisms such as HPSG or Categorial Grammar ([19, 20, 15]). 
Seen as a generalization of first-order terms (FOTs), TFSs are also used to specify logic 
programs and constraint systems in frameworks such as LOGIN ([4]), LIFE ([2]), ALE ([10, 8]), 
CUF ([11]), TFS ([26]) and others. Many general frameworks that are completely independent 
of any linguistic theory can be used to specify grammars for natural languages. Indeed, 
most of the above mentioned languages were used for specifying HPSG grammars. Different 
systems employ different kinds of TFSs, based on a variety of algebraic definitions, and we 
usually follow the representation of [9] here. 

Linguistic formalisms (in particular, HPSG) use TFSs as the basic blocks for representing 
linguistic data: lexical items, phrases and rules. They usually do not specify a mechanism 
for manipulating TFSs: parsing algorithms, for instance, are external to the formalism. 
Moreover, HPSG has no formal definition yet, so that it is not fully determined what the 
characteristics of the formalism are, nor what the properties of a specific HPSG grammar 
are. In general constraint solvers based on TFSs the operations performed on the structures 
are more explicit, though such systems usually suffer severe efficiency problems. When no 
processing direction is specified, and the system searches the complete space of solutions for 
some specification, its performance is disappointing. Clearly, efficient processing calls for a 
different method. 

In this paper we present a first step of an approach for processing TFSs that guarantees 
both an explicit definition and high efficiency. Viewing grammars for natural languages as 
formal specifications in a high-level programming language, we incorporate techniques that 
have proved valuable in computer science, especially in programming language semantics, 
thus utilizing the benefits of bringing together the two paradigms: computer science and 
linguistics. Our main aim is to provide an operational semantics for TFS-based linguistic 
formalisms, especially HPSG. We adopt an abstract machine approach for the compilation 
of specifications of such grammars. The abstract machine comprises data structures and 
instructions, augmented by a compiler from the TFS formalism to the abstract instructions. 